Chapter 1


PREDICTING THE PAST: CORRELATION, EXPLANATION, 
AND THE USE OF ARCHAEOLOGICAL MODELS 


Lynne Sebastian and W. James Judge 


MODELS AND ARCHAEOLOGY 


One of the more interesting developments in the field of archaeology in the
recent past is the emergence of predictive modeling as an integral component of the
discipline. Within any developing and expanding field, one may expect some initial
controversy that will, presumably, diminish as the techniques are tested, refined,
and finally accepted. We are still very much in the initial stages of learning how to
go about using predictive modeling in archaeology, and this book represents an
effort by some of the leading experts in the field to present a comprehensive and 
detailed examination of this approach to understanding how people in the past used 
the landscape in which they lived. 


There are probably as many definitions of the term model as there are scientific
disciplines; several will be suggested in subsequent chapters of this book. We would
like to offer a definition presented by David Clarke, who noted that models are
“hypotheses or sets of hypotheses which simplify complex observations whilst
offering a largely accurate predictive framework structuring these observations”
(1968:32). There are two key aspects of this definition. The first is that models are
selective abstractions, which of necessity omit a great deal of the complexity of the
real world. Those aspects of the real world selected for inclusion in a model are
assumed to be significant with respect to the interests and problem orientation of
the person constructing the model. This is an important concept, since it indicates
that there is no such thing as a truly objective model, be it inductively or
deductively generated. Thus all models reflect, to a considerable degree, subjectivity
on the part of the observer.


The second key aspect has to do with the predictive capability of models. Note 
that by this definition models have predictive content, and thus the term predictive 
modeling is somewhat redundant. We will employ this term here, however, since it 
has been widely accepted in archaeology.

This emphasis on the predictive aspects of models brings us to a more detailed
examination of the concept of prediction itself, which the dictionary defines as “the
ability to foretell on the basis of observation, experience, or scientific reason.” One
might even say that prediction is the essence of science because it allows us to
formulate expectations about the future state of a system that are based on our
knowledge of such systems or of similar ones (i.e., models). The point is that
prediction is important, and that it is achieved scientifically through the generation
of hypotheses that can be tested against the empirical record. Thus the method of
prediction is essentially a deductive process, regardless of the form of generation of
the model itself. Although the degree of formality might vary considerably, nearly
all archaeological research today is based on a fundamentally deductive methodology.


Verification of formal predictive statements (hypotheses) through empirical
testing against the archaeological record frequently involves techniques of sampling.
In one sense all archaeology involves sampling, since we are never confronted
with the complete record of past human behavior. Realizing this, archaeologists
distinguish between relative degrees of sampling, as in “100 percent inventory” vs.
“sample survey.” In this case, even though the results of both surveys are acknowledged
to be samples, the latter term refers to a formally articulated, specific
sampling strategy that guides the character of the inventory.

We mention sampling at this point because in the past formal sampling has
frequently been confused with, and at times even identified with, predictive 
modeling; in the eyes of some, the implementation of a sampling design actually 
constitutes predictive modeling. Unfortunately, this confusion of sampling and 
predictive modeling has led to erroneous interpretations of the capabilities of the 
latter. Some researchers have even assumed that simply by adopting formal sampling
techniques they would be able to predict archaeological site loci and thus
satisfy legal compliance requirements without having to undertake expensive, 100 
percent inventory surveys. 

We would emphasize that sampling and predictive modeling are not the same 
thing and that formal sampling is neither required by predictive modeling nor 
limited to that approach. Sampling is simply one method of verifying testable 
hypotheses (albeit a very important one). In the strict sense, i.e., as a technique of
data acquisition, formal sampling is no more (or less) related to or important to the
modeling process than is 100 percent inventory survey.

One of the most unfortunate results of this confusion is that land-managing
officials are at times led to believe that it is relatively easy to predict where all sites
should be, and that by sampling a few of the predicted sites the archaeologists can
do their jobs while saving themselves time and effort and saving the taxpayers a
great deal of money. Realizing the distinction between sampling and prediction is a
valuable first step in understanding how very complex the process of predictive
modeling really is.

Both archaeologists and managers can and should be interested in refining
attempts to model human behavior and in refining the sampling techniques used to
gather the data needed to verify such models. But neither models nor sampling
should be viewed as a panacea destined to solve all the problems of management of 
archaeological resources and of compliance with existing legislation. This is a
methodological fact of life that will be demonstrated repeatedly throughout this
book.


THE PROBLEM OF EXPLANATION 


Explanation in Archaeology 


In the process of maturation, perhaps all scientific disciplines pass from a 
basically descriptive stage to a stage in which true explanation is attempted—a 
process of development that is sometimes painful and often divisive. The archaeological
profession has been experiencing this transition for the past two decades, and
the process has been both difficult and variably successful. 


Twenty years ago, archaeology was a discipline in which most of the activity
was directed toward describing the data that we recover. Since that time archaeologists
have increasingly made conscious and consistent attempts to explain the
changes in cultural process that were documented during the prior descriptive
phase of archaeological research. It is obvious that such documentation must take
place before explanation can be sought, but it is equally apparent that a discipline
such as archaeology cannot remain at the descriptive level if it is to realize its full
potential in contributing to scientific understanding.

Thus archaeologists who are undertaking the inventory and excavation of 
archaeological resources today are not simply concerned with accurately describing
the artifacts and other data they find; they are equally concerned with placing those 
data in the context of explanation. That is, once they have determined what the 
things they recover are (or, more accurately, were) and how those things changed 
through time, they become interested in determining why such changes took place, 
in the explanation of such changes. In terms of the current jargon of our profession, 
we have progressed from dealing strictly with the archaeological context of the data 
to exploring their systemic context and finding means of linking the two realms. 


As it has matured, archaeology has changed from a descriptive, documentary 
discipline to one that attempts to understand certain aspects of human behavior 
with reference to independent events and variables known to have occurred in the 
past. It is this attempt to understand human behavior that has given archaeology a
new direction, a new sense of purpose, perhaps. Some would even say that this
effectively legitimizes archaeology as a profession that is dependent in large part on
public funding, but such a statement would evoke considerable argument among
archaeologists themselves. In any case, most archaeologists would agree that we
have progressed as a discipline, and that the new sense of purpose arising from
explanatory research emphases should be of concern to both archaeologists and land
managers.

It is in the context of this transition from description to explanation that an
important dichotomy apparent in this book arises. Those who read large sections of
this book rather than using specific parts as a reference volume will soon notice that
some authors focus on models that are deductively derived and attempt to predict
how particular patterns of human land use will be reflected in the archaeological
record, while others are working with inductively derived models that identify and
quantify relationships between archaeological site locations and environmental
variables. The latter models, which we term correlative, are by far the more common
in current modeling practice. It is our contention (and one that is shared by some
but not all of the volume authors) that this emphasis on descriptive models will and
should eventually be replaced by an emphasis on models that are derived from our
understanding of human behavior and cultural systems, models with explanatory
content.


The Value of Correlative Models 


This call for explanation and explanatory models should not be taken as 
disparaging research that focuses on empirical analysis. Description, classification, 
and inductive generalizations are basic building blocks in any science. It should be 
clear from the sheer weight of information on correlative models in this volume and 
from the material presented in the management-oriented chapter (Chapter 11) that
correlative models are informative and extremely valuable in many contexts. 


In Chapter 11 Kincaid suggests that for some applications, simply knowing 
where sites are likely to be located relative to various environmental variables is
sufficient. For large-scale planning purposes, for example, this level of knowledge
about the distribution of archaeological resources may indeed be all that is needed
for immediate purposes. But as suggested below, it may not be a wise use of 
resources to plan a research project solely to produce this level of information. 


Several of the concepts introduced by Kvamme in the model applications
chapter (Chapter 8), those of activity space and use intensity in particular, make clear
a second important contribution of correlative models. If a research project requires
information about the general nature of human use of a landscape, correlative models
provide invaluable data. It is both intuitively obvious and clear from the ethnographic
record, for example, that human groups employing different subsistence
strategies make use of their environments in very different ways. Their mobility
patterns vary enormously, and the particular resources and proportions of those
resources used are equally variable. In the archaeological record these differences
will be reflected as differences in the scale of redundancy in distributions of cultural
remains, that is, how big an area must be inspected before patterns in the archaeological
record begin to repeat. Likewise, the nature and strength of correlations
between cultural remains and features of the environment will be strongly affected
by differences in prehistoric resource selection. If we wish to monitor variability
among human systems on the large scale, correlation models can provide a quantifiable
and easily displayed measure of differences and similarities.


The Limitations of Correlative Models 


Despite the utility of correlative models for planning purposes and for certain
research applications, their general usefulness is limited for several reasons. The
first is that no matter how carefully designed, methodologically sophisticated, and
thoroughly tested a correlative model is, the end product is simply a series of
statements about correlations between the occurrence of cultural remains and
particular parameters or conjunctions of parameters of the modern environment.
Correlation does not tell us anything about causality. We do not know, and cannot
determine from the model, why this relationship between cultural materials and
environmental factors exists. Worse yet, from an archaeological perspective, we do
not know and cannot determine anything about the human system that created and
deposited these cultural materials other than some very general notions about the
distribution of their activities on the landscape.


The second limitation grows out of the first. Because correlative models are
designed to tell us where sites are located (relative to various environmental variables)
and not why they are located as they are with respect to those variables, even
when they work exceedingly well, we do not know why they work. To the manager
who only needs to know where sites are this may not immediately appear to be a
major limitation. But if we do not know why a model works in one particular study
area, we will not know whether we should expect it to work in the next valley or the
next county or in a similar but distant environment. Thus correlative models are
not truly predictive, but consist of projections of an observed pattern from a sample
to the whole universe. When the focus of attention shifts to a new data universe, the
process of projection must begin anew. As will be discussed in the next section, this
lack of generalizability in correlative models should make this limitation of concern
to managers as well as to the professional archaeologist.

The third limitation arises because correlative models require measurable, 
mappable data. For this reason, they depend heavily on environmental factors to 
provide their independent variables, and because of this they are most successful 
when applied to societies whose movements, group size, and activities are highly 
regulated by aspects of their environment —generally hunters and gatherers. With 
a shift from food collection to food production, human societies enter into a different 
kind of relationship with their environment (characterized by Kohler in Chapter 4 
as one of increasing intensification). This does not mean that settlement locations of 
formative level societies cannot be modeled or that they are unresponsive to 
environmental factors. But the relationship with environmental factors is probably
more indirect and is certainly more complex and interactive. Additionally, with
increasing sedentism, social and political factors come to have an increasing impact
on the distribution of activities and thus of sites, lessening the correlations with
strictly environmental variables.


Finally, because human groups with different subsistence orientations and
different levels of technology use the landscape in very different ways, correlation
models based on environmental variables are difficult to build for areas that have
been occupied over a long period of time. In the American Southwest, for example,
where the same area may have been used by Paleoindian, Archaic, Puebloan, and
Athabaskan groups, a single correlation model of the relationships between cultural
resources and environmental variables would be of very limited value. In such cases,
an entire series of separately derived and tested models might be necessary, one for
each major adaptation type.


The Value of Explanatory Models 


The discussion above of the transition to explanation in modern American
archaeology makes clear the importance of explanatory models to the archaeological
profession and suggests that explanatory models are central to whatever value
archaeology has for society as a whole. As anthropologists, we are interested in
human behavior, in cultural variability and similarity, in cultural stability and
change, in the adaptation of humans as cultural beings to their natural and social
environments. As social scientists, we have an obligation to add to the store of
human knowledge about humanity, and as archaeologists we have a unique opportunity
to contribute knowledge about the long-term history of humankind, about
adaptational successes and failures, and about the evolution of the complex social,
political, and economic systems that order and dominate our lives.


If the value of explanatory models to archaeologists is clear, the value of these
models to landholding agencies and to individuals involved in the field of cultural
resource management is far less obvious. Because correlative models are relatively
straightforward to develop and because simple environmental variables are relatively
easy to measure, these models are viewed as cost-effective and objective. And
in the short run they often provide the kinds of information needed. This has
sometimes led to a perception on the part of managers that explanatory models are
an unnecessary luxury. There are at least two reasons, however, why such models
may, in the long run, prove to be critical to the very people who now question their
utility or at least their cost-effectiveness.


The first reason has to do with the lack of generalizability for correlative
models that was discussed above. If we do not know why a model works in one study
area, we have no way of knowing whether it will work in a new study area; however
much we may believe or expect that it will work, we cannot know. In order for a
cultural resource manager to use information derived from models, even for the
most general planning purposes, he or she must know that the model works within
specified levels of confidence and precision. With correlative models, therefore, the
process of model development, testing, refinement, and retesting can never be
short-cut: every new situation will require the development and verification of a
new model.

With explanatory models, on the other hand, eventually we can hope to be
able to offer general models that can be demonstrated to be applicable in any
situation characterized by a specified set of cultural system and ecosystem variables.
The key word here is, of course, “eventually”; as noted in the next section,
explanatory models are extremely complex and difficult to build, and it may be a
long while before we can be consistently successful in doing so. But that does not
alter the potential value to resource managers of such powerful and truly generalizable
models.


The second reason why explanatory models are potentially of great value in a
management context has to do with the basic foundation of cultural resource
management as it was envisioned in the National Historic Preservation Act (NHPA).
One of the more colorful senior members of the American archaeological community
admonishes his students not to lose sight of their major research objectives and
become bogged down in trivia by reminding them that “It's hard to remember that
you started out to drain the swamp when you're up to your [anatomical reference
deleted] in alligators.” Cultural resource management (CRM), especially as it is
practiced in large land-managing agencies, tends to have the same problem.
Sometimes we become so bogged down in the minutiae of finding sites and protecting
sites and mitigating impacts to sites that we lose track of the reason why these
things called “sites” have any importance, any claim to protection under the law.


A great deal of time and energy is devoted to compliance with Section 106 of
the NHPA, the section that mandates consideration of the impacts of federal
undertakings on cultural resources and avoidance or mitigation of those impacts
where possible. Sometimes this attention to Section 106 causes us to lose track of the
requirements of Section 110, which charges federal agencies with the larger task of
locating, inventorying, and nominating to the National Register of Historic Places
the eligible properties under their control and instructs them to take care that these
properties are not “inadvertently transferred, sold, demolished, substantially
altered, or allowed to deteriorate significantly.” In management terms, so much
energy is going into the support program that the primary program gets slighted.

Probably the most commonly cited criterion for claiming National Register 
eligibility for a prehistoric site is that it has “yielded, or may be likely to yield,
information important in prehistory or history” (36 CFR 60.4). It is their information
content rather than any intrinsic value that gives archaeological sites significance and
thus a legal right to protection, and it is because of this information content that the
landholding agencies have been given a mandate to manage these resources. 





It is the long-range goals of Section 110 compliance that can most benefit from
the kind of understanding of the archaeological record that could be gained from
explanatory models. For most archaeological sites discovered during the course of
CRM-funded surveys, the survey recording and analysis will constitute the only
scientific attention ever accorded to those sites. We would suggest, therefore, that
by calling for archaeological models that emphasize explanation rather than correlation,
managers would maximize their return on realizing the information potential
of the sites under their jurisdiction and find themselves in a better position to fulfill
their responsibilities under Section 110. While correlation models might eventually
become powerful and sophisticated enough to meet some of the requirements of
Section 106 compliance, explanatory models could, in the long run, come much
closer to meeting the need of compliance with Section 110.


The Limitations of Explanatory Models 


The limitations of explanatory models are discussed by Altschul in Chapter 3,
but his evaluation of the problems can be summed up in one short sentence:
explanatory models are extremely difficult to create and validate. The length of the
method and theory chapter (Chapter 4) and the complexity of the arguments
presented therein by Ebert and Kohler make clear the difficulty of identifying the
linkages and warranting the arguments in a model that is based in anthropological
theory. The length of the model applications chapter (Chapter 8) and the complexity
of the techniques discussed by Kvamme make it clear that currently correlative
models are far ahead of explanatory models in methodological sophistication and
mathematical expression.


The other serious limitation of explanatory models is one that is common to all
attempts at explanation in archaeology. It has to do with assigning meaning to what
we find in the archaeological record. In building an explanatory model we use
information derived from the systemic context (often from ethnographic or
ethnoarchaeological research, but sometimes from geography, ecology, or other fields)
to generate hypotheses about the archaeological context. If we build these hypotheses
into models and test them against the archaeological record and find that the
results tend to confirm the model, then we assign meaning to the archaeological
remains based on our interpretations of the systemic context.


The danger here is that our understanding of the systemic context will be
incorrect. If we say that finding x in the archaeological record will mean that y
happened in the systemic context, and if our ideas about y are wrong, then no
matter what we find in the archaeological record our interpretations will be flawed.
For example, until the late 1960s most archaeologists believed that hunters and
gatherers led an extremely difficult and precarious existence, teetering constantly
on the brink of starvation and devoting every waking hour to the quest for food.
Given such a perspective it seemed obvious that any hunter-gatherer group that
had the opportunity to do so would immediately adopt agriculture, which was
viewed as an easier and more secure way of life. Subsequent research demonstrated
that hunting and gathering is, in fact, a rather stable and secure means of making a
living and that agriculture is, in fact, both a more laborious and (in many environments)
a less secure subsistence strategy. Most of the early archaeological research
on the origins of agriculture was based on these incorrect notions about the
systemic context of hunting and gathering, and the results were, therefore, wrong
or at least inadequate.


Although this danger of being fundamentally wrong is certainly an important 
limitation of explanatory models, it is also in a sense an indication of progress. As 
long as archaeologists concentrated solely on description and documentation it was
nearly impossible for them to be wrong in any but trivial matters. But when they 
took the major step of attempting true explanation, they had to accept the risk of 
being profoundly wrong along with the rewards of gaining knowledge. The same 
relationship exists between correlative and explanatory models. Although there 
may be arguments about how to test for correlation or how to measure the strength 
of a correlation or assign confidence limits to it, once those are resolved the only 
question that remains is whether a correlation exists or not. With explanatory 
models the risks of being very wrong are much higher, but the potential gains in 
knowledge are correspondingly increased. 


In the final analysis, we would suggest, a willingness to accept the risk of being 
wrong is one of the requirements of science. Scientific explanation consists of 
theories, statements about the way that we believe the world operates. An individ- 
ual scientist offers an explanation that he or she believes accounts for as much 
variability in the phenomenon under study as possible. Subsequently this scientist 
and others test this explanation against data concerning the phenomenon, and the 
explanation is refined and revised to cover yet more of the variability. Empirical 
generalizations concerning the data can serve as one source of explanatory hypoth- 
eses, but those hypotheses cannot subsequently be tested against the same data. 
And empirical generalizations based on the archaeological record can never gener- 
ate explanations of human behavior. We would argue that while correlative models 
are valuable in several contexts and explanatory models have several serious 
limitations, the ultimate goal of archaeological modeling, whether carried out for 
research purposes or to meet management needs, should be explanation. 


HISTORY OF THE BLM PREDICTIVE MODELING PROJECT 


In May 1983 a group of Bureau of Land Management (BLM) state archaeolo- 
gists and Forest Service regional archaeologists from the Rocky Mountain states 
were meeting in Salt Lake City as part of a multistate task force designing 
procedures to deal with oil and gas development on public lands. During the course 
of these meetings, a number of informal discussions took place about the potential 
and problems of predictive modeling. It soon became clear that this was a subject of 
both great interest and great concern to the task force members, and a decision was 
made to begin a group project to study the ramifications and requirements of 
predictive modeling and to coordinate modeling efforts throughout the Mountain 
West. 

As it happened, the Colorado State Office and Service Center of the BLM had 
recently initiated a predictive modeling study project, and with the support and 
cooperation of many people in the management hierarchy of the BLM, the newly
organized group of state and regional archaeologists was able to secure permission in
September of 1983 to expand the scope of this already approved project to encom- 
pass an in-depth, state-of-the-art study of predictive modeling in archaeology. All 
those who had been at the task force meetings recognized that such a study was 
necessary if the problems encountered as a result of previous uses of predictive 
modeling in resource management contexts were to be avoided. This volume is the 
first product of the BLM Cultural Resource Predictive Modeling Project, but it is 
not the only product being planned. A training program and a technical assistance 
service for field personnel are planned, along with a set of demonstration models, 
which will be developed in future phases of this project. 

In their proposal to expand the predictive modeling study to make it as 
comprehensive as possible, the Project Advisory Team (PAT; that is, the BLM and 
Forest Service archaeologists) pointed out that several predictive modeling 
attempts that had recently been carried out in management contexts had been 
highly controversial and of limited utility. They went on to add that since knowledge
about this topic was limited among cultural resource professionals, both
within the government and outside it, the lack of standards, guidelines, and
procedures was hindering effective and efficient use of modeling for resource 
management. 

The specific failings of past modeling efforts that they noted included failure to 
address management needs, lack of specificity, poor use of existing data, ineffective 
or biased sampling designs, inappropriate statistical analysis techniques, failure to 
collect inventory data suitable for the development of a predictive model, development
of models using nonreplicable techniques, lack of comparability of and
inappropriate use of environmental variables, lack of phasing to allow for model 
testing and refinement, and failure to use such technical aids as remote sensing and 
geographic information systems to streamline model development. 


The stated goals of the expanded predictive modeling project were 


1. to evaluate trends in the development of predictive modeling critically, 
using knowledge gained through past research; 


2. to explore the feasibility and practicality of predictive modeling for 
meeting management objectives; 

3. to analyze and define the components of the model-building process, 
particularly with respect to cultural resource management; 


4. to develop a set of standards for the archaeological and environmental 
data to be used in modeling efforts; and 


5. to provide BLM field offices with information on data collection for 
modeling purposes and statistical manipulations of those data. 


The most important step in meeting these goals would be to contract with a
team of outside consultants (archaeologists with national reputations in the field of
predictive modeling) to produce a comprehensive, publishable report on this
topic. In addition, this project would have considerable input from BLM personnel,
from a volunteer advisory group consisting of archaeologists for other federal
agencies and individuals from State Historic Preservation Offices and the National
Advisory Council on Historic Preservation, and from the professional archaeological
community, including private contractors, representatives of professional organizations,
and personnel from universities and museums. These individuals are named
in the Acknowledgments at the front of this book.


To ensure that the profession at large would have the opportunity for a high
level of input, several steps were taken. Once the expansion of the predictive 
modeling project had been approved, the PAT met at the Nevada State Office in 
Reno to determine how to organize and implement the project. As part of this 
meeting, the PAT met with representatives of the Society for American Archaeol- 
ogy (SAA) in an effort to secure society input and support for this project from its 
inception. The project team also corresponded with the society’s president and 
executive committee, outlining the goals of the project and requesting suggestions 
for potential contractors and comments on the initial chapter outlines for the 
proposed book. In addition, members of the team met with regional representatives 
of the SAA to discuss the project and secure input, and the Procurement and 
Personnel Committee of the PAT held an open meeting for potential contractors 
and other interested persons at the 1984 annual meetings of the SAA in Portland, 
Oregon. 


From the beginning of the project the BLM’s Washington office provided 
normal intra-agency coordination among Washington, D.C., agencies. The PAT 
provided project briefings in Washington for top-level management and for senior-level
agency archaeological program heads. Useful project direction was offered by
these individuals, and most agreed to organize and provide a formal review of the 
initial draft document by their respective agencies. 


Preliminary chapter outlines for the proposed predictive modeling book were 
prepared at the November 1983 meeting in Reno. Once these had been reviewed by 
the various advisory groups, final outlines were prepared, and requests for proposals 
were sent to potential contractors suggested by the various advisory groups and by 
members of the PAT. Those who wished to bid on one or more chapters responded 
with proposals that included detailed revised outlines for the chapters of interest. 
Successful bidders were selected on the basis of separate cost and technical proposals,
with technical merit being more important than price. Since quality performance
was considered vital to a successful project, the government reserved the right
to award a contract on other than the lowest-price basis if a higher-priced proposal 
was rated higher in quality. The revised outlines submitted by the successful 
bidders were once again circulated to the advisory groups for comment, and then in 
August of 1984 the entire book-production team—authors, editors, and PAT —met 
in Denver for a prework conference.


THE PRODUCTION OF THIS VOLUME


At the prework meeting the authors and editors were given a cram course on 
the history and goals of this project, and then we attempted, in the course of several 
strenuous but exhilarating days, to give structure and coherence to this exercise in 
authorship by committee. We dealt ruthlessly with redundancies, struggled with 
what proved to be an insurmountable dichotomy among the authors in their view of 
the very nature of predictive modeling, and shifted the content and order of the 
chapters so many times that everyone (except the technical editor, who was 
keeping score) lost track of the “new” order by the third day. 


One of the most difficult tasks of those days in Denver was to get a group of 
largely academic- and contract-oriented archaeologists to think in terms of management
issues. Indeed, the very phrase “management concerns” produced mock
groans by the end of the first day. We did gradually become more aware of the whole 
gamut of problems implied in the concept of management concerns, but it also 
became apparent to everyone that in writing and editing this book we could only do 
what we knew best —produce a book about predictive modeling; the real grappling 
with management concerns would have to be done by those who understood them 
best —the federal archaeologists of the PAT. At that point Dan Martin and Chris 
Kincaid, charter members of the PAT, agreed reluctantly to write the management 
issues chapter of the book with heavy input from the other team members; 
subsequently Burt Williams bowed to similar pressure and “volunteered” to be a 
coauthor on this chapter. By the end of the Denver meeting we had developed a 
final outline for the book and for each of the chapters, and the authors’ difficulties
began.


Between August of 1984 and January of 1985 most of the material in Chapters 
2-10 of this book was written—an impressive feat given that all of the authors had 
simultaneous major commitments to teaching or to other contracts and writing 
responsibilities. In February of 1985, after we had a chance to at least skim most of 
the manuscripts, the editors and the PAT met to discuss the “product” and to
make various editorial decisions. It was again clear that the main body of this book 
was not as management-oriented as the team members had hoped, but it was also 
clear that the manuscripts before us were the raw material of an invaluable resource 
volume—containing comprehensive, up-to-date treatments of the theoretical, 
methodological, and technical issues facing those who attempt to do archaeological 
predictive modeling. And again, this meant that the burden of meeting “manage- 
ment concerns” was going to lie wholly on the PAT members who were writing the 
management issues chapter. After this meeting, the editors’ difficulties began. 


In a slow, collaborative process between editors and authors (and taking the 
written comments of the PAT closely into account) we gradually shaped the 
individual manuscripts into the chapters of a generally unified book. As noted 
below, we made no effort to impose an artificial consistency of viewpoint on these 
authors. Archaeological predictive modeling is a field in which no consensus has 
emerged: that is one of the main points that is demonstrated in this book. When
the authors and editors had reached agreement on draft chapters, the book was sent
out for a detailed and extensive peer review in October of 1985. 


The reviewing agencies and organizations are also listed in the Acknowledgments.
The review comments were compiled by the PAT and the volume editors, who
carefully considered all comments and then summarized them by areas of concern.
Minor questions or comments were handled by the editors; more substantial 
comments were forwarded to the authors, who responded in whatever way seemed 
appropriate and incorporated changes based on points raised by the reviewers into 
their various chapters. The results of the review are discussed in Chapter 12. 


It was at this point that we hit the only major snag in the whole process of 
producing this volume. The Washington office of the BLM was not satisfied with 
the management concerns chapter and did not release it for review along with the 
rest of the book. Through a very long process of discussions between the PAT and 
the Washington office, it eventually became clear that Chapter 11 would have to be
completely rewritten. Chris Kincaid once again accepted this task, and in 1988 she 
produced a draft of the chapter as it appears in this book. Chapter 11 and Chapter 
12, the summary by Judge and Martin, were sent out for comment to a smaller 
corpus of reviewers selected from the large number of people who reviewed 
Chapters 1-10. 


We have included this detailed discussion of the history of the BLM predictive 
modeling project and of this book because we, as editors, feel that this volume
represents the culmination of a remarkable cooperative effort, something that we
can say because the credit for those noteworthy aspects of this project lies with
others. The determination and far-sightedness of the PAT members who conceived 
the notion of a large-scale, comprehensive, and high-quality effort and then guided, 
coaxed, and coerced the project into becoming a reality were certainly remarkable 
and commendable. Special merit accrues to Dan Martin and Chris Kincaid, who 
kept the project going during the long Chapter 11 delay and who wrote and rewrote 
the new Chapters 11 and 12 to solve the problems. 


Finally, this book represents a remarkable degree of involvement and coopera- 
tion on the part of many people from all sectors of the archaeological profession. 
This has certainly contributed substantially to the quality of the book, but equally 
important, this level of cooperation seems to us to indicate that the often decried 
isolationism of academic, federal, and contract archaeologists may, like reports of
Mark Twain's demise, have been greatly exaggerated. 


THE STRUCTURE OF THIS BOOK 


General Orientation 


It will probably be helpful to the general reader to know four things about the 
overall orientation of this book at the outset. The first of these is that this is a book
about modeling in the context of prehistoric archaeology. While many of the
principles suggested and techniques used would undoubtedly be of use to archaeol- 
ogists studying classical and historical societies, particular problems and concerns of 
those scholars and techniques that would be especially helpful to them are touched 
on only in passing in this book. This orientation is a reflection of the background
and experience of the authors and editors, and it is also a result of most of the extant
predictive modeling studies having been concerned with prehistoric cultural
remains. 


The second thing, while we are on the subject of the intended audience for this 
volume, is that we have tried to maintain a balance between materials that would be
of most interest to landholding agency managers and federal and state archaeologists
and material that would be of interest to the archaeological profession in
general. Certain chapters, such as the method and theory discussion by Ebert and 
Kohler in Chapter 4, will certainly be of greatest interest to professional archaeologists,
while other chapters, such as the management perspectives chapter by
Kincaid (Chapter 11), will be of greatest interest to managers. Still other chapters,
such as the statistics discussion by Altschul and Rose (Chapter 5), will probably be 
viewed by readers of both persuasions as a resource document to be consulted as 
needed. The result of this effort to balance the book among somewhat disparate 
audiences is that nearly all readers will find some parts of the book more interesting
than others. We have attempted, through our discussion below of the subjects 
covered in each chapter, through frequent cross-referencing, and through the 
production of a relatively detailed index, to enable the reader to identify quickly 
those subjects and discussions that are likely to be of interest to him or her. 


The third thing to be noted is that even though some of the volume authors 
are strongly committed to the necessity for constructing explanatory models with 
major deductively derived components (see especially Ebert and Kohler in Chapter
4), by far the largest part of the book consists of information on correlative models 
derived largely or wholly through inductive means. These conflicting conceptions 
of the proper nature and direction of predictive modeling in archaeology are clear 
throughout the book; there was some discussion about the advisability of attempt- 
ing to impose an editorial “synthesis” on the two camps of authors to create a 
theoretically and methodologically unified book, but we felt that this was artificial 
and premature. The division that is apparent in this book between those who are
building sophisticated and fascinating correlative models and those who insist that
archaeology is explanation or it is nothing is a reflection of the state of predictive
modeling in American archaeology today. We felt that if this book was to be a fair
summary of “the state of the art,” the unresolved theoretical conflicts as well as the
exciting technological and methodological advances should be explored. We have 
offered our own ideas on explanation and correlation in archaeological models in a 
previous section of this chapter, but we tried not to impose those ideas on the 
authors during the editing process. 


The final point that we should raise about the general orientation of this book 
is that it is heavily biased toward models for hunter-gatherer societies. It was not
planned that way, and we tried to decrease this bias after the first draft of the book
was finished. But we found that it was not that simple. In large part this emphasis on
hunter-gatherers is a reflection of the emphasis on correlative models. As was noted
in the discussion of the limitations of correlative models above, these models are
most successful when applied to societies with a food-collecting subsistence base
and relatively simple and fluid forms of social and demographic organization. In
addition, this emphasis on hunter-gatherers seems to be a result of the interests of
many of the researchers carrying out archaeological modeling projects today, so in
this way the book is again a reflection of current developments in the field. We see
this lack of modeling interest in middle-level or formative societies as well as 
historical societies as unfortunate, however, and would like to think that an 
increased interest in this topic will be one of the trends in future modeling projects. 


A Preview of Coming Attractions 


The main body of this book contains information that can roughly be divided 
into four topics. Chapters 2 through 4 present general discussions related to the 
modeling process. In Chapter 2 Kohler first reviews the intellectual history of what 
we today call predictive modeling, tracing the changing views of the relationship 
between human societies and their environment through time. He then discusses 
the contributions of the culture ecologists and especially that of Julian Steward to 
our thinking about this relationship. Finally, he describes the growing interest in 
predictive modeling in recent years and suggests a set of general criteria for 
evaluating models—generalizability, simplicity, internal consistency, precision, 
and falsifiability —using a group of example modeling projects to illustrate these 
concepts. 


In Chapter 3 Altschul discusses models in general and the process of modeling. 
He suggests a typology of predictive models based on the spatial referent of the 
model and provides archaeological examples of the various types. He also discusses
the methodological pitfalls of the various types of models and their strengths and
weaknesses. Finally, he provides an overview of the model-building process, touch- 
ing on data collection, synthesis, and evaluation; selection of independent and 
dependent variables; and model testing and refinement. All of these topics are 
addressed in detail in Chapters 5 through 8. 


Chapter 4, by Ebert and Kohler, deals not with modeling as such, but with the 
theoretical and methodological considerations that must underlie all modeling 
efforts if the resultant models are to be faithful replications of human systems. 
Although the material presented is sometimes difficult, the concepts under discussion
are, in the long run, just as critical to the success of modeling efforts as are
questions of data collection or statistical manipulation. The authors discuss the 
organization of human systems and the implication of various organizational princi- 
ples for the nature of the archaeological record produced. They also consider the 
relationship between human systems and the ecological systems of which they are a
part. Finally, they discuss the archaeological record itself (the way it is formed and
the processes that affect it after the cultural materials are deposited) and offer
suggestions about the implications of these formation and transformation processes
for archaeology in general and for predictive modeling in particular.

Chapters 5 through 8 cover the details of the modeling process presented in
overview in Chapter 3. In Chapter 5 Altschul and Rose discuss statistical approaches
to modeling, particularly the theoretical and methodological considerations that
must be taken into account in the course of building quantitative models. This
chapter is not a cookbook of statistical techniques, but rather presents information
on the general types of quantitative models. They discuss techniques of prediction
and classification, emphasizing the strengths, limitations, and underlying assumptions
of each, and describe various procedures for verifying the resultant models and
generalizing from them.


Chapter 6, by Altschul and Nagle, covers the strategies and techniques 
involved in collecting new data for use in model development. The important and 
complex topic of sampling and the attendant problems of unit size and shape, 
sample size and means of selection, and techniques of parameter estimation are 
covered in detail. The authors also present a valuable discussion of the particular 
problems that arise when data must be collected within the constraints of cultural 
resource management surveys, where the survey universe and often the survey 
intensity are prescribed on the basis of considerations that have nothing to do with 
modeling requirements or research needs. Finally, they discuss various considera- 
tions of data recording, especially those imposed by “no collection” surveys. 


In Chapter 7 Kvamme discusses the use of already collected data for model
development, a topic of considerable importance given the quantity of existing data
and the cost of data collection. As the author points out, the major problem with
using existing data is that they very often are biased, and usually the type or types
of biases present in the data base are unknown. He discusses the most common
types of bias and suggests the effects that such biases will have on models developed
using these data. He then offers a series of procedures for reducing deficiencies and
minimizing the effects of biases. Finally, he describes ways of evaluating models 
built with existing data and means of determining what additional data must be 
collected in order to create a satisfactory model. 


In Chapter 8 Kvamme goes on to discuss the actual steps in model building, 
beginning with the selection of variables and describing in detail various quantita- 
tive techniques for pattern recognition and assessment. He then considers the 
difficult problem of assessing model performance, discussing various means for 
measuring accuracy rates and assigning confidence limits to model results and 
providing a comparative analysis of several kinds of quantitative models. 


Chapters 9 and 10 present information on types of technical aids that are 
available to assist researchers in the development of predictive models. In Chapter 9 
Ebert summarizes the field of remote sensing, describing the devices used, the 
kinds of data that can be derived, and the types of analytical procedures commonly
applied to them. He then discusses the general potential of remote sensor data for
predictive modeling applications and describes and evaluates several archaeological
modeling projects that have involved the use of such data. 


In Chapter 10 Kvamme and Kohler discuss a very exciting and relatively new 
technological aid, the Geographic Information System (GIS). A GIS comprises a set 
of computer programs, the hardware on which the programs run, and a spatially 
organized data base. In a GIS, data are derived from maps and similar sources of 
information on spatial relationships, and these data are stored not sequentially, as 
they are in most data base management applications, but in a form that retains the 
organizational information of the original data as well as the actual values of the
variables. The applications of GIS discussed by Kvamme and Kohler make it clear 
that the potential of these systems for aiding in the predictive modeling process is 
enormous. 
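
As a rough illustration of what it means to store data in a form that retains
spatial organization, the following Python sketch keeps two map layers as grids
whose row and column positions stand for cells on the ground. The layer names,
the values, and the helper `cells_matching` are hypothetical and are not drawn
from any system described in this volume.

```python
# Hypothetical 4 x 4 study area: each grid position is one map cell, so the
# position of a value carries the spatial relationship as well as the value.
elevation = [  # meters
    [1500, 1510, 1530, 1560],
    [1495, 1505, 1525, 1550],
    [1490, 1500, 1515, 1540],
    [1488, 1495, 1510, 1530],
]
water_distance = [  # meters to nearest water
    [400, 300, 200, 300],
    [300, 200, 100, 200],
    [200, 100, 0, 100],
    [300, 200, 100, 200],
]


def cells_matching(layers, predicate):
    """Return (row, col) positions where predicate(*values at that cell) is true."""
    first = layers[0]
    hits = []
    for r in range(len(first)):
        for c in range(len(first[0])):
            values = tuple(layer[r][c] for layer in layers)
            if predicate(*values):
                hits.append((r, c))
    return hits


# Overlay query: low-lying cells within 100 m of water.
near_low = cells_matching(
    [elevation, water_distance],
    lambda elev, dist: elev <= 1510 and dist <= 100,
)
print(near_low)
```

Because every layer shares the same grid, a query over several variables reduces
to reading the same position in each layer, which is the property the
sequential storage of an ordinary data base does not provide.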


Finally, Chapter 11 is concerned with the federal management perspective on
archaeological predictive modeling. The chapter is organized around a series of
commonly asked questions, e.g., “What kinds of models are there? When do we use
which type?” Kincaid summarizes relevant conclusions reached by the various
authors and describes the potential usefulness of models for such central tasks of 
CRM as inventory, evaluation, resource protection, and planning. 


In Chapter 12 Judge and Martin offer an appraisal both of the relative success or 
failure of the project in meeting the goals set for it originally and of the massive 
review process to which the draft manuscript was subjected. They then suggest 
several major issues raised in the course of this volume that they feel should be 
central questions in future modeling efforts. 


The final section of this volume is an appendix compiled by Thoms, which 
presents an annotated review and assessment of a number of important and 
representative archaeological predictive modeling projects that have been carried 
out in recent years. The purpose of this appendix is to provide additional informa- 
tion on the kinds of projects that have been done, on the types of data that have 
been generated, and on the successes and pitfalls of such projects in the past. 


We hope that this book will become a major reference volume for the archaeo- 
logical profession as a whole as well as filling its original role in providing compre- 
hensive, up-to-date information on topics related to predictive modeling for federal 
archaeologists and land-use managers. We feel that the blend of information offered 
here on modeling concepts, mathematical and statistical techniques, technical aids 
(such as remote sensing and GIS), and concerns about the relationship between 
modeling and archaeological method and theory will go a long way toward meeting
the needs of researchers who are interested in this form of data analysis and 
interpretation and who wish to construct informed, sophisticated models.


In addition to the more general thanks expressed in the volume acknowledgments, we would like
especially to thank two people. First we wish to express our appreciation to Dan Martin, whose gentle
persistence, unfailing good humor, and determination kept the project afloat through delays, disasters,
and aggravations and made this publication a reality. But most of all, we want to thank June-el
Piper, the technical editor of this volume, who typed, edited, formatted, printed, read proofs, helped
with indexing, cajoled authors, soothed savage beasts, and kept track of enough details to cause
brown-out in lesser mortals.


REFERENCE CITED 


Clarke, David L.
1968 Analytical Archaeology. Methuen, London.





Chapter 2 


PREDICTIVE LOCATIONAL MODELING: 
HISTORY AND CURRENT PRACTICE 


Timothy A. Kohler 


In a volume primarily devoted to predicting locations of archaeological materials
on the basis of factors in the natural environment, it seems important to spend a
little time examining the anthropological underpinnings for such endeavors. In the
first part of this chapter, relevant portions of the history of anthropological thought 
up to the 1940s are reviewed briefly and the contributions of Julian Steward are 
discussed in greater detail. Steward’s work is emphasized in this historical section 
because, I will argue, most proponents of predictive locational modeling adopt,
though not always consciously, both a cultural ecological position on the nature of
culture and the cultural ecological causal approach to understanding. 


In the second major division of this chapter the development of archaeological
settlement pattern studies is discussed as it relates to these developments in theory;
many settlement pattern studies differ from predictive locational models only in 
their lack of explicit extrapolation to a spatial population. This specialized discus- 
sion does not attempt to summarize the entire history of settlement pattern studies; 
see Parsons (1972) or Ammerman (1981) for a more comprehensive review. 


Finally, the potential uses of predictive locational models from both manage- 
ment and research perspectives are set forth, followed by a few examples from the 
literature. These examples are meant to illustrate the diversity of approaches 
currently in use and some of the most obvious issues that these approaches raise. 
The reader interested in additional examples of recent locational models is referred
to Kohler and Parker (1986) and to the appendix of this volume. 


An important premise of this chapter is that predictive modeling as it is
presently practiced is fundamentally about environmental determinism. That is
why, in the next section, we briefly recapitulate the increasingly sophisticated
forms this paradigm has taken. Why are the social, political, and even cognitive 
religious factors that virtually all archaeologists recognize as factors affecting site 
location and function usually ignored in predictive modeling? 


One obvious reason is that most models are constructed inferentially, starting 
from a sample of archaeological sites in a region and generalizing to an unknown
population of sites in that same region. This is made possible by resorting to maps
displaying environmental categories across the total region with which site locations
have been empirically correlated in the sample. At the same time, a total
mapping of sites (the remains of the social and political network) is not available, or
a predictive model would not be necessary. 
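
The inferential procedure just described can be sketched in a few lines of
Python. This is only a minimal illustration under stated assumptions: the
environmental categories, survey figures, and function names below are
hypothetical, and real models would also need to attach some measure of
uncertainty to the estimates.

```python
from collections import defaultdict


def density_by_category(sample_units):
    """Estimate sites per square km for each environmental category.

    sample_units: iterable of (category, surveyed_area_km2, site_count).
    """
    area = defaultdict(float)
    sites = defaultdict(int)
    for category, area_km2, site_count in sample_units:
        area[category] += area_km2
        sites[category] += site_count
    return {cat: sites[cat] / area[cat] for cat in area}


def expected_sites(densities, region_area_km2):
    """Project sample densities onto the mapped area of each category."""
    return {
        cat: densities[cat] * km2
        for cat, km2 in region_area_km2.items()
        if cat in densities
    }


# Hypothetical surveyed sample: (category, area surveyed, sites found).
sample = [
    ("floodplain", 1.0, 4),
    ("floodplain", 1.0, 2),
    ("terrace", 2.0, 2),
    ("upland", 4.0, 1),
]
densities = density_by_category(sample)

# Hypothetical category areas read from an environmental map of the region.
region = {"floodplain": 10.0, "terrace": 40.0, "upland": 200.0}
projection = expected_sites(densities, region)
print(densities)
print(projection)
```

Note that nothing in the sketch refers to the locations of other sites: the
only inputs are environmental categories, which is precisely why social and
political factors drop out of models built this way.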

Altschul is clearly correct when he says, in the next chapter of this volume, 
that “magnet sites” may significantly affect settlement density in their neighborhoods,
presumably for reasons that go far beyond factors of the physical and biotic
environments. In his example, the density of settlements around major Hohokam 
sites in the Santa Cruz River Valley of southern Arizona was greater than predicted
on the basis of environmental features. And yet, it is possible to find examples in the
archaeological record where precisely the opposite effect has been documented. In
some periods of its history, for example, Teotihuacan in the Basin of Mexico seems
to create a vacuum around itself; in others, sites seem to be denser in its vicinity
than elsewhere (maps associated with Sanders et al. 1979). To further complicate
matters, such changes may be due in part to changes in the area's role in a much
larger, supra-regional system (see Paynter 1982:xi) that may be poorly understood.
On a smaller, simpler scale, the large Pueblo I site of Grass Mesa in the Dolores River
Valley of southwestern Colorado also seems to have created a partial settlement
vacuum in its vicinity during the peak of its occupation (Kohler 1986:37).

This brings us to a second reason why nonenvironmental variables have not 
been used in most predictive locational models: archaeologists simply don’t know
how to use them. It is reasonable to believe that our sister disciplines, such as
geography, might have solved such problems, particularly for the non-hunter- 
gatherer societies that they have emphasized. This is not the place for an exhaustive 
review of geography, but it is worth mentioning two approaches commonly used in 
the geographic literature to see whether they might help us. 

One such approach with deep roots is the well-known central place theory, 
conceived by Von Thunen in 1826, expanded by Christaller in 1933, and introduced
to the English-speaking world by Ullman in his famous article, “A Theory of 
Location for Cities” (1941; in Boyce 1980). Among other things, the theory predicts 
that cities will arise in the centers of productive areas; that they will be larger as 
their tributary areas become larger; that when a region is packed with cities the
“tributary” spaces will be best described as hexagons; and most important, that a
hierarchy of city size occurs, with centers in each class being predictable in number
and in distance from each other. Ullman noted, as have many others, that the actual
location of centers may be distorted by the distributions of resources and transpor- 
tation routes, and that 


the type of scheme prevailing in various regions is susceptible to many influences.
Productivity of the soil, type of agriculture and intensity of cultivation, topography,
governmental organization, are all obvious modifiers...

The system of central places is not static or fixed; rather it is subject to change and
development with changing conditions....

Christaller may be guilty of claiming too great an application of his scheme. His criteria
for determining typical-size settlements and their normal number apparently do not fit
actual frequency counts of settlements in many almost uniform regions [in Boyce
1980:176-177]


Given the subtleties and especially the fluidity of the sociopolitical environment,
is it any wonder that archaeologists have chosen to concentrate on those
relatively stable, “distorting” factors of the natural environment for locational
prediction? In general, the central place model appears to be more valuable for
analysis of a total spatial pattern of contemporaneous settlements than it is for
prediction of the total distribution from some small subset of it. Nevertheless, it does
have potential within a predictive context if enough of the settlement system is
known to enable discernment of levels of size-class hierarchy, typical spacing of
settlements within levels, and degree of influence of the various environmental
factors serving to distort the ideal pattern. (For more discussion of central place
modeling see Haggett et al. 1977 and various articles in Smith 1976.)
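
The undistorted pattern that the theory predicts for centers of a single size
class can be sketched as follows. This is a hedged illustration only: it
assumes a featureless plain and a single level of the hierarchy, and the
function names and the spacing value are hypothetical. Equally spaced centers
fall on a triangular lattice, so each interior center has six equidistant
neighbors and its tributary area is a hexagon.

```python
import math


def hex_lattice(spacing, rings):
    """Place centers on a triangular lattice within `rings` steps of the origin.

    Keys are axial lattice coordinates (q, r); values are (x, y) positions.
    """
    points = {}
    for q in range(-rings, rings + 1):
        for r in range(-rings, rings + 1):
            if abs(q + r) <= rings:
                x = spacing * (q + r / 2)
                y = spacing * (math.sqrt(3) / 2) * r
                points[(q, r)] = (x, y)
    return points


def nearest_neighbors(points, key, spacing, tol=1e-9):
    """Return the lattice keys lying exactly one spacing away from `key`."""
    x0, y0 = points[key]
    return [
        k
        for k, (x, y) in points.items()
        if k != key and abs(math.hypot(x - x0, y - y0) - spacing) < tol
    ]


centers = hex_lattice(spacing=10.0, rings=2)  # hypothetical 10 km spacing
neighbors = nearest_neighbors(centers, (0, 0), 10.0)
print(len(centers), len(neighbors))
```

Departures of observed settlement spacing from such a lattice are what Ullman
attributes to resources, transportation routes, and the other modifiers quoted
above.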


Another possibly relevant line of inquiry in geography is the study of industrial
location. Let us assume for a moment that there are enough similarities between the
problem of minimizing transport costs in the placement of factories and in the
placement of relatively stable residential locations to make such an analogy worthwhile.
Economic assumptions have thoroughly permeated this field so that, at least
until very recently, profit maximization, in the context of perfect and complete
information and thorough predictability of future circumstances, has been the
single goal guiding analysis. In Alfred Weber's “least-cost” model (1929), the goal
was to minimize transportation costs per unit of production, although benefits of
agglomeration and labor availability might slightly distort the location predicted to
be ideal on this basis (Gold 1980:217-231).
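
The transport-cost goal can be made concrete with a short Python sketch. It is
not Weber's own procedure: it assumes that cost is proportional to weight times
straight-line distance and uses a standard iterative scheme (Weiszfeld's
algorithm) to find the point minimizing that total. The source locations,
weights, and function names are hypothetical.

```python
import math


def transport_cost(site, sources, weights):
    """Total weighted straight-line distance from `site` to every source."""
    return sum(w * math.dist(site, s) for s, w in zip(sources, weights))


def least_cost_site(sources, weights, iterations=500, eps=1e-12):
    """Approximate the point minimizing transport_cost (Weiszfeld iteration)."""
    total = sum(weights)
    # Start from the weighted centroid of the sources.
    x = sum(w * s[0] for s, w in zip(sources, weights)) / total
    y = sum(w * s[1] for s, w in zip(sources, weights)) / total
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for (sx, sy), w in zip(sources, weights):
            # Clamp the distance so a site landing on a source cannot divide by zero.
            d = max(math.dist((x, y), (sx, sy)), eps)
            num_x += w * sx / d
            num_y += w * sy / d
            denom += w / d
        x, y = num_x / denom, num_y / denom
    return (x, y)


# Four equally weighted sources at the corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
center = least_cost_site(square, [1, 1, 1, 1])

# Three sources, one of which is much heavier to move than the others.
sources = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
weights = [1, 1, 5]
site = least_cost_site(sources, weights)
print(center, site)
```

The sketch optimizes distance costs alone, which is exactly the restriction
that the later literature reviewed by Gold calls into question.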

Later refinements of Weber's approach concentrated on correcting overly 
simplistic assumptions about transport costs, market demand, and methodological 
factors, and it was not until the 1970s that analysts began to question its reliance on 
economic factors in general and distance costs (as opposed to other costs) in
particular (Gold 1980:218). Now, to judge by Gold's recent review of this area,
interest centers on questions that were previously ignored, including how decisions
are made in industrial organizations; how the wider industrial, business, and
sociopolitical environments affect locational decisions; the extent to which attitudes 
about regions predispose locational behavior; and how locational searches are 
actually conducted. Gold concludes that such research is at an “exploratory stage” 
but that previous (exclusively economic) theory put forward “a model of behavior 
which, by its inherent assumptions, says little about the processes by which 
real-world locational decisions are reached” (Gold 1980:230-231). At this point it 
appears that archaeologists can profit from reading this literature but will not be 
able to find here a working, realistic model that will solve their own problems. 

While archaeologists must redouble their efforts to build workable models 
with predictive power that take into account how social and political variables as 
well as those of the narrow economic environment affect location, and while 
questions as to how locational decisions are reached in relatively small-scale societies 
need more attention from archaeologists, the field as it presently exists (not as it 
perhaps should be) is the subject of the remainder of this chapter. 


THE WIDER HISTORICAL DEBATE: HOW AND TO WHAT 
EXTENT DOES THE NATURAL ENVIRONMENT INFLUENCE 
HUMAN BEHAVIOR? 


Attempts to explain differences among human societies are as old as the 
recognition of that diversity. The role of environmental factors in creating this 
diversity has been a subject of inquiry and debate since antiquity. In classical times 
these inquiries ranged from abstract questions about the origin of the earth and of 
humankind to the search for 


rational explanations for the existence of both health and disease, explanations which 
called for consideration, among other factors, of the nature and direction of winds, the 
effects of swamps and damp places, the relation of sunlight and of the sun's position in 
the heavens to the proper siting of houses and villages, and which, by extension, 
encompassed investigation of the effects of “airs, waters, and places” on national 
character [Glacken 1967:7 -4] 


An early example of this perspective is the History of Herodotus. Written in the 
fifth century BC and primarily concerned with the struggle of the Greeks to free 
themselves from Persian influence, the History also provides sketches of some 50 
societies with attention to their geographic location, environment, dress, food, 
dwellings, form of self defense, and prestige as judges among other peoples (Hodgen 
1964:23). 

Yet in the Mediterranean world following the collapse of the Roman Empire, 
this comparative, cross-cultural tradition of inquiry that included environmental 
factors within its scope lost ground to theological interpretations of cultural 
diversity. Diffusion of the original Adamic culture, as outlined in the first chapters of 
Genesis, followed by local degeneration was generally considered to be sufficient 
explanation for diversity through the fifteenth and sixteenth centuries (Hodgen 
1964:254-294). 


A prominent dissenter was Jean Bodin, a French jurist writing towards the end 
of the sixteenth century, who argued that 


a sound solution to the problem of cultural diversification was not to be clouded by 
controversy over the early peopling of the world, or by a theory of original sin, migration, 
or the breakdown of tradition among the bearers of the Adamic tradition. Leaving all of 
this to one side, he elected to take man as a given, concentrating on the relation of several 
cultures to land, to climate, and to the topographical features of the several geographical 
regions 








The physical constitution of men, or their humoral makeup, determined their moral 
aptitudes or dispositions. Environment, climate, the conditions of time and place, did all 
the rest, reacting on men through their bodies [Hodgen 1964:276, 278]. 


For example, Bodin characterized people from hot climates in the northern hemi- 
sphere as being small of stature, weak, dark-haired and dark-skinned, fearful of 
heat, sad, hardy, mutinous, solitary, sober, and philosophic. People from cold 
regions were supposed to exhibit the opposite qualities (Hodgen 1964:279-280). 


Grand schemes seeking to establish causal connections on ethnic, regional, or 
even continental scales between environmental factors (especially climate) and a 
wide variety of racial and cultural characteristics became more prominent in 
eighteenth-century Enlightenment thinking. Even the Baron de Montesquieu, 
although he was particularly prone to considering the form of government as the 
factor affecting all other aspects of society, did not ignore the influences of climate 
and environment. He was also willing to accord different factors causal primacy 
among different societies: 


Nature and climate rule almost alone among the savages [people with no nonlocal 
political structures and no domesticated plants or animals]; customs govern the Chinese; 
the laws tyrannize in Japan; morals had formerly all their influence in Sparta; and the 
ancient simplicity of manners once prevailed at Rome [Evans-Pritchard 1981:7]. 


We may conclude that even in the humanistic, rationalistic eighteenth century 
some natural philosophers took the position that, at least for some societies, causal 
initiative was to be found in the natural environment rather than in the mind. It was 
in reaction to such views that towards the end of the eighteenth century John 
Adams was led to complain, 


The world has been too long abused with notions that climate and soil decide the 
characters and political institutions of nations. The laws of Solon and the despotism of 
Mahomet have, at different times, prevailed at Athens; consuls, emperors, and pontiffs 
have ruled at Rome. Can there be desired a stronger proof, that policy and education are 
able to triumph over every disadvantage of climate? [Glacken 1967:685]. 


Montesquieu in particular, and to a lesser extent some of his contemporaries, clearly 
saw the interrelationship and interdependency among all aspects of a society 
(Evans-Pritchard 1981:4), thus laying the foundations for a functional view of 
culture that is one of the building blocks for modern cultural ecology. Although 
sweeping generalizations establishing connections directly from climate to human 
personality sound remarkably odd today, they represent unsophisticated precur- 
sors to modern cultural ecological positions that differ mainly by invoking a more 
credible and restrained chain of causation. 


On the whole, however, eighteenth-century environmental or geographic 
determinism was a minor thread in a fabric that 


stressed the factor of conscious rational choice as the key to explanation of sociocultural 
differences. . . . [Enlightenment theoreticians] could not see a superorganic system 
interacting with the natural environment and responding with adaptive evolutionary 
transformations, which were neither comprehended nor consciously selected by the 
individual members of the society [Harris 1968:51]. 


A necessary prerequisite to the techno-environmental perspective as espoused 
by Harris was a credible theory of evolution, supplied for biology in the mid- 
nineteenth century by Darwin, even as Spencer was elaborating a similar theory for 
sociocultural evolution—a theory already expressed in part in the earlier writings of 
Turgot, D’Holbach, and others (Harris 1968:123). The goal of the great anthropolo- 
gists over the last half of the nineteenth century (Spencer, Tylor, and Morgan) was 
to develop cultural evolutionary sequences using data from archaeology and from 
contemporary primitive societies. Their comparative method used modern “survi- 
vals” of earlier forms, not necessarily as exact replicas of stages through which other 
groups had progressed but as models from which something could be learned about 
earlier adaptations. 


In geography at this time the focus of interest continued to be on the sort of 
geographic determinism espoused by Jean Bodin and reflected in some of the 
writings of Montesquieu. This view is strongly expressed in the writing of the 
nineteenth-century German geographer Friedrich Ratzel (1896-1898). Ellen Sem- 
ple, who helped to interpret the ideas of Ratzel to the English-speaking world in the 
early 1900s and who is often regarded as an extreme geographical determinist, wrote 
of the effects of environment and climate on human stature, musculature, pigmen- 
tation, vocabulary, economy, population density, and migration, as well as of the 
“physical effects of geographic environment” (Semple 1911:40). 


Interesting counterpoints to such views also appeared in the nineteenth 
century, however. The reciprocal nature of the relationship between people and 
their environments (ignored in simple environmental or geographic 
determinism) was beginning to be appreciated in some quarters. George Marsh, in 
Man and Nature, or Physical Geography as Modified by Human Action (1864), reasoned that 
many important influences emanate not from nature to humans but rather in the 
opposite direction. 


In the early years of the twentieth century, historical particularism, most 
purely exemplified by Franz Boas, constituted a rebellion against the largely 
unilinear cultural evolutionary sequences of the nineteenth century and against the 
comparative method used by Morgan, Spencer, and others. Nor did this new school 
of anthropological thought have any use for the simple, mechanical, large-scale 
correlations among environmental features, race, and culture that were still being 
promulgated by some geographers. One of Boas’s most prominent students evalu- 
ated the causes behind the historical particularists’ avoidance of environmental 
factors in the discussion of cultural phenomena: 


In part this represents a healthy reaction against the old naive view that culture could be 
“explained” or derived from the environment. For the rest, it is the result of a 
sharpening of specific anthropological method and the consequent clearer perception of 
culture forms, patterns, and processes as such: the recognition of the importance of 
diffusion, for instance, and the nature of the association of culture elements in 
“complexes.” Most attention came to be paid, accordingly, to those parts of culture which 
readily show self-sufficient forms: ceremonial, social organization, art, mythology; 
somewhat less to technology and material culture; still less to economics and politics, and 
problems of subsistence. Much of the anthropology practiced in this country in the 
present century has been virtually a sociology of native American culture; strictly 
historic and geographic interests have receded into the background, except where 
archaeological preoccupation kept them alive [Kroeber 1939:3]. 


Ironically, in his ethnographies Boas remarked on environmental factors influencing 
site location, as in his astute observation that the distribution of population among 
the Central Eskimo was strongly related to conditions of sea-ice favorable to 
hunting the ringed seal (Damas 1969:1). In his later, more general work, however, he 
downplayed the role of the environment as a determinant of human behavior. 


Another ironic feature of the impact of historical particularism on anthropol- 
ogy is that it showed the way for a more productive analysis of the relationship 
between culture and environment. By reducing the scale of his observation—by 
being a particularist—Boas in some ways anticipated a more modern approach to 
the problem of correlating settlement practices with environmental features. In his 
discussion of the roots of ecological explanation in anthropology, Ellen (1982:5-6) 
makes the important point that 





The problem of drawing correlations between environmental and social phenomena is 
very much a question of magnitude—the geographic (or demographic) scale of the 
correlations postulated. . . . The more specific the correlation the greater the possibility 
of there being a single determining relationship and the greater the accuracy in predict- 
ing future events under specified conditions. 


This is a crucial observation for the task of locational modeling. Many valid 
criticisms can be made of naive environmental determinism for its suggestions of 
large-scale, simplistic correlations between environmental and cultural features. 
These criticisms are not all germane, however, to more specific correlations 
between certain environmental features and certain aspects of human behavior. 
Settlement systems and ecosystems are both complex, and we should not expect to 
find simple correlations between them. The task of locational modeling is to isolate 
those aspects of the environment that do influence settlement behavior and place 
them into perspective with nonenvironmental factors that also influence settlement 
behavior. 


In the generation of anthropologists following the period in which historical 
particularism reached its ascendancy, people like Kroeber and, to a lesser extent, 
Wissler (e.g., Wissler 1922) once again began to study the relationship between 
environment and culture. This time, however, the relationship was stripped of 
causality. Both Kroeber and Wissler were interested in culture areas that were 


relativistically defined in terms of their distinguishing characteristics and occurred in 
different environmental settings. . . . [T]he concept of adaptation of the cultures, 
especially of the nature of social groups . . ., [was not] taken into account. In fact, this 
would smack of reductionism, which Kroeber, holding firmly to the idea that cultures 
should be dealt with on the superorganic level alone, had always opposed [Steward 
1973:53-54]. 


Julian Steward and Cultural Ecology 


The contributions of one of Kroeber’s students ultimately have had more 
impact on archaeology than those of Kroeber himself. Along with a number of 
influential contemporaries that included Omer Stewart and Leslie White (Stewart 
1943; White 1949), Julian Steward (1938, 1955) was responsible for three advances in 
the discussion of environmental concepts that have specific importance for the 
practice of locational modeling. First, Steward, unlike anthropologists using the 
culture-area concept, was interested in causal explanation rather than correlation; 
second, he emphasized the effect of particular local aspects of the environment on 
particular facets of culture, thus moving away from large-scale correlations of 
regional environments with “culture types”; third, he identified more or less 
specific pathways through which environments might influence cultures (in his 
“culture core” concept) and tried to devise a procedure for studying the extent of 
these influences (Ellen 1982:52-53). 


In Steward’s terms, those aspects of a culture that were most closely connected 
with environmental exploitation constituted the “culture core”; other aspects, 
determined by purely cultural historical factors, were considered secondary fea- 
tures. Core features and secondary features had to be identified empirically, and 
these could be expected to differ in differing environments and cultures. For a 
particular culture, discrimination between core and secondary features began with 
an examination of the natural environment and of the relations between the 
environment and the economy. Next, the patterns of behavior involved in exploit- 
ing this environment with a specific technology were recognized. Finally, the 
influence of these behavior patterns on other aspects of culture was assessed 
(Steward 1938:2; 1955:37, 40). All aspects of culture implicated in these investiga- 
tions constituted the core; the residua were the secondary features. This procedure 
clearly reveals the direction and type of causality that Steward believed to be at 
work in the relationship between environment and culture. 


Not all features of the natural environment equally influence the core of 
culture, and what is important may be expected to vary from area to area. For the 
aboriginal groups occupying the Great Basin and adjacent portions of the Colum- 
bian and Colorado plateaus at the time of contact with Euroamericans, for example, 
Steward suggested that “the important features of the natural environment were 
topography, climate, distribution and nature of plant and animal species, and, as the 
area is very arid, occurrence of water” (1938:2). He took the density and distribution 
of the population; the division of labor at sexual, familial, and communal levels in 
hunting, fishing, and seed-gathering; the territory covered and the time required 
for different economic pursuits; and the size, composition, distribution, and degree 
of permanency of villages to be behavior patterns that were directly and strongly 
influenced by the nature of the environment, in the context of the technology 
available to exploit it (Steward 1938:2). 


His comments on the village locations of specific groups were based on 
conversations with informants who were recalling a lifeway that by that time was 
extinct and, usually, on visits to the areas in question. Many of these comments 
indicate which factors Steward considered to be determinants of site location. The 
Northern Paiute of Owens Valley, for example, lived in an area that was rich and 
diverse in comparison with most of the Great Basin. Their villages were relatively 
permanent and were situated on the alluvial fans of streams where these water- 
courses emerged from the canyon wall, about 2-4 mi from the Owens River. These 
locations afforded access to abundant water and were centrally located with respect 
to critical floral resources (except for pinon nuts) growing in or near the valley. Sites 
related to pinon nut extraction and use were located in the adjacent Inyo and White 
mountains and might be occupied during part of the winter in the event of 
abnormally abundant harvests. As important determinants for winter (or 
permanent) village locations for all the groups he studied, Steward repeatedly mentions the 
availability of water, ample timber for houses, and fuel, and he also emphasizes 
avoidance of areas with unacceptably cold winter temperatures. Thomas (1973) 
used simulation to predict what the artifact dispersal patterns should be if Steward’s 
reconstruction of the Great Basin Shoshonean subsistence-settlement system ap- 
plied to precontact times in the Reese River Valley. Steward’s predictions, as 
operationalized by the simulation, were generally verified. 


Some of Steward’s views on the responsiveness of site location to environmen- 
tal factors will be systematized into a more general framework in Chapter 4 and 
hence are worth additional discussion here. To judge by Steward’s work, the 
locations of winter villages in the Great Basin ought to be relatively predictable on 
the basis of associated environmental features. For example, Steward characterized 
the entire Shoshonean culture as practically, even “gastrically,” oriented. Since 
Shoshonean groups were frequently at risk of starvation, their adaptation (broadly 
speaking, including the location of their settlements) was constantly exposed to 
selective processes. Social and political factors that may affect site location— 
defensibility; access to trade partners and routes; and economic, social, and political 
obligations to nonlocal groups—were of minimal importance in comparison with 
many areas in North America where warfare was more frequent, economic speciali- 
zation more pronounced, the family not the basic economic unit, and social and 
political groups more rigidly structured and less local. It will be argued in Chapter 4 
that this constellation of factors, which will be placed on the low end of a 
continuum of “intensification,” results in settlement behavior that is quite 
responsive to environmental factors. Moreover, the structure of the environment is 
such that the resources apparently affecting winter village location are relatively 
concentrated in space, overlap to a fairly high degree, and exhibit either high 
temporal constancy (meaning that they can always or nearly always be found in the 
same place, as in the case of water and certain aquatic resources) or high temporal 
contingency (meaning that they are seasonally predictable). It will be argued in 
Chapter 4 that this kind of patchiness and this kind of temporal predictability make 
for high site visibility and high site predictability on the basis of environmental 
variables. 

The location of pinon-gathering stations, on the other hand, depends in part 
on the distribution of pinon resources, which in any year are relatively widely 
distributed, seldom overlap with other critical resources, and exhibit low temporal 
predictability. Logically, this environmental structure should lead to dispersed, 
poorly visible, and poorly predictable distributions for archaeological materials 
deposited during pinon exploitation. On the basis of these observations, and of 
Steward’s discussions, we would expect different parts of this settlement system to 
have differing visibility and variable degrees of predictability on the basis of 
environmental variables. 


Even this brief discussion of Steward’s approach and conclusions clarifies the 
continuity between inductive locational modeling and Steward’s work. Steward 
demonstrated that, at least for some site types and in some environments 
exploited by some groups in the arid portions of western North America, there is good 
of map-readable environmental determinants. In addition, he argued for a more or 
less one-way directionality of influence: from the environment, as exploited by a 
particular technology, to the culture core. Finally, although his research was 
influenced by a strong and consistent theoretical orientation, Steward argued that 
the particular aspects of the environment that are most relevant to adaptation 
(which is to say, to the composition of the culture core) have to be discovered 
empirically. 


People in Their Ecosystem: Post-Stewardian Developments 


Locational modeling, particularly in its inductive variety, normally assumes 
that certain environmental variables strongly influence site location. If settlement 
behavior can be considered to be part of the “culture core,” this assumption finds 
support in Steward’s cultural ecology. The strong, although frequently implicit, 
reliance of locational modeling on Steward’s theories or on other variants of what 
Trigger (1971) calls “deterministic ecology” makes the resultant models susceptible 
to the many criticisms to which Steward’s work has been subjected in the last two 
decades. 


One outstanding problem is an ambiguity in the definition of the culture core, 
which is noted both by Harris (1968:660-662) and by Kohl (1981:102). There is no 
rigorous objective procedure for determining what constitutes the core, and it is 
clear from Steward’s own statements that the core may occasionally encompass 
social, political, and even religious patterns. May we assume that all aspects of 
settlement behavior are core elements? If not, which aspects are? Another problem 
is Steward’s assumption of an unrealistically unidirectional influence of the 
environment on culture. A third problem is unrecognized complexity and variability in how 
the environment is perceived in different cultures (Brookfield 1969). 


Steward’s approach enjoys continued popularity among many practicing 
archaeologists, especially those involved in hunter-gatherer studies (Bettinger 
1980:190). As a result of these problems, however, and perhaps also as a result of the 
increasingly sophisticated ecological studies of the last two decades, many human 
ecologists and some archaeologists have begun to abandon Steward’s framework in 
favor of an ecosystem perspective influenced by evolutionary ecology, a 
development that is more evolutionary than revolutionary. A very selective sample might 
include publications by Marston Bates (1953), J. W. Bennett (1946), Harold 
Brookfield (1968), J. G. D. Clark (1952), David L. Clarke (1968), Harold Conklin (1961), 
Kent Flannery (1968), Stanton Green (1980), Donald Hardesty (1975), Robert 
Netting (1974), Roy Rappaport (1971), and Bruce Winterhalder (1981), among many 
others. Although each of the researchers who has shifted to an ecosystems 
perspective has unique points to make, Roy Ellen (1982:75-78) has attempted to summarize 
several characteristics shared by most workers involved in this reorientation of 
culture-environment studies: 


1. Monism. Behavioral and environmental traits are analyzed as part of a 
single system. Culture becomes part of animal behavior, or at least it must 
follow rules that do not contradict those imposed by natural selection. 


2. Complexity. Significance and causality in this single, integrated system 
containing both the culture and the environment are “found in the web of 
finely interrelated factors rather than with general propositions at the level of 
gross categories” (Ellen 1982:76). 


3. Connectivity and mutual causality. “In the ecosystem view, all social activi- 
ties impinge directly or indirectly on ecological processes and are themselves 
affected by those same processes. Fauna (including humans), vegetation, soil 
structure, and microclimate are intricately related and mutually 
interdependent” (Ellen 1982:76). 


4. Process. In this systemic view of relationships the emphasis is on the 
interaction of variables (for example, positive and negative feedback relation- 
ships) rather than on correlations between social and environmental variables 
at particular states of the system. 


5. Populations as analytic units. Local human populations replace societies as 
units of observation and analysis, a situation analogous with the ecological 
analysis of nonhuman populations. 


Local, detailed paleoenvironmental reconstructions are of special concern to the 
archaeologists involved in this reorientation, and this is a concern with which 
Steward would have been sympathetic. There is an increasing awareness that such 
information must not simply be brought in as an after-the-fact explanation for 
observed changes through time in human use of the landscape, as has long been the 
practice. Rather, settlement system studies should account in a dynamic manner for 
changing resource distributions related to changing climates (e.g., Darsie 1983). 


The challenge to Steward’s approach posed by these advances is also implicitly 
a challenge to locational modeling as typically practiced. Future advances in 
locational modeling depend on our learning how to incorporate the richness and 
complexity of the systemic perspective in our locational predictions. 


THE EMERGENCE OF SETTLEMENT PATTERN STUDIES 
IN ARCHAEOLOGY 


One important result of Julian Steward’s insistence on the importance of the 
local environment in the study of living (and recently living) cultures and of his 
interest in the location of ethnographic settlements was the development of studies 
of archaeological settlement patterns. The survey component of the Viru Valley 
program conducted in the late 1940s was instituted largely as a result of Steward’s 
influence (Willey 1953:xvii). Willey’s 1953 monograph about this work is generally 
regarded as having defined a new field of inquiry in archaeology: 


The material remains of past civilizations are like shells beached by the retreating sea. 
The functioning organisms and the milieu in which they lived have vanished, leaving the 
dead and empty forms behind. An understanding of structure and function of ancient 
societies must be based upon these static models which bear only the imprint of life. Of 
all those aspects of man’s prehistory which are available to the archaeologist, perhaps the 
most profitable for such an understanding are settlement patterns. 


The term “settlement pattern” is defined here as the way in which man disposed himself 
over the landscape in which he lived [Willey 1953:1]. 


Willey included within the scope of settlement pattern studies the nature of 
dwellings and their arrangement within settlements and the nature and distribu- 
tion of communal buildings. His discussion of the role of environmental, technologi- 
cal, and demographic change in affecting settlement patterns is not elaborate by 
modern standards; he was much more interested in how the community patterns of 
these large, late-prehistoric sites in Peru were affected “by various institutions of 
social interaction and control” (1953:1). 


Nevertheless, a field was defined, and a series of papers (Willey 1956) published 
three years after Willey’s Viru Valley report contains many contributions 
emphasizing the importance of environmental variables in determining the distribution of 
human populations across the landscape (e.g., Haury 1956; Heizer and Baumhoff 
1956; Williams 1956). Other authors (e.g., Sears 1956) were interested more in the 
social and political aspects of community patterning than in environmental rela- 
tions. In 1968 Trigger defined the various aspects of settlement patterns somewhat 
more rigorously than had previously been done, and he distinguished among the 
probable determinants of location for individual buildings, community layouts, and 
“zonal patterns” (Trigger 1968). In the case of zonal patterns, he states that “the 
overall density and distribution of population of a region [are] determined to a large 
degree by the nature and availability of the natural resources that are being 
exploited” (1968:66). He notes, however, that broad economic (as opposed to simple 
subsistence), political, religious, and defensive factors may also be important 
determinants of site location among agriculturalists. 


In the 1970s several important initiatives added new items to the list of 
environmental variables that archaeologists were willing to consider as possible 
determinants of location, and they also affected the ways that these variables were 
handled analytically. For example, the “situation” of a site (Roper 1979:11-14) or 
the putative “territory” of the community occupying it (Vita-Finzi and Higgs 1970) 
began to be scrutinized in addition to the more traditional on-site environmental 
characteristics. Catchment analysis, as this investigation is usually called, was 
designed to provide insight into the economic activities of the occupants of a site. 
Like most efforts to use the distribution of environmental variables in understand- 
ing site location, catchment analysis makes the joint assumptions that 


the most important transactions for most people were with the environment . . . [and 
that] humans tend to minimize the time or effort expended in their economic 
transactions with the environment (or perhaps they include effort and time expenditure as 
considerations in these transactions). In societies without advanced transportation these 
two factors, strong economic coupling with the environment and minimization of time 
and effort, encourage location close to important economic resources [Kohler and 
Parker 1986:400; emphasis original]. 
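
The kind of tabulation that catchment analysis implies can be sketched in a few lines. The example below is an illustration only and does not reproduce any published procedure: it assumes a landscape already divided into grid cells, each assigned a single resource class, and it treats the catchment as a circle of fixed straight-line radius around the site. The cell coordinates, class names, radius, and function name are all hypothetical.

```python
import math
from collections import Counter

def catchment_summary(site, resource_cells, radius_km):
    """Share of each resource class among cells within radius_km of the site."""
    counts = Counter()
    for (x, y), resource in resource_cells.items():
        # Straight-line distance is an assumption; walking time would
        # require terrain data that this sketch does not model.
        if math.hypot(x - site[0], y - site[1]) <= radius_km:
            counts[resource] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {resource: n / total for resource, n in counts.items()}

# Hypothetical 1-km grid cells keyed by their center coordinates.
landscape = {
    (0, 0): "arable", (1, 0): "arable", (0, 1): "pasture",
    (2, 2): "pasture", (5, 5): "forest",
}
print(catchment_summary(site=(0, 0), resource_cells=landscape, radius_km=2))
```

Comparing such summaries across sites, or against the composition of the region as a whole, is one way to ask whether locations favor particular resources.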


Another important advance made in the 1970s was in the analysis of data. 
Steward himself had avoided statistical approaches, and following perhaps uncon- 
sciously in his footsteps virtually all settlement pattern studies for many years 
followed an anecdotal form. That is, the investigator called attention to apparent 
tendencies for sites to be located in areas having specific constellations of natural 
features, much in the same way that Steward did in his Basin-Plateau work cited 
above. Where these relationships were patent, the observations were probably 
correct, at least to the extent that the original surveys were not biased by an
internalized model of “where sites should be.” Nevertheless, it was a great
contribution to settlement pattern studies when the participants in the Southwestern
Anthropological Research Group (SARG) helped to introduce a more rigorous
testing procedure for determining the degree of relationship between site locations
and environmental variables. This procedure involves the creation of expected site 
distributions for comparison with observed site distributions, using formal statisti- 
cal inferential techniques. 
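
The comparison of expected with observed site distributions can be sketched in a few lines. The example below is only an illustration and is not the SARG procedure itself: the strata, surveyed areas, and site counts are invented, and the null model assumes that sites fall in each stratum in proportion to its surveyed area.

```python
# Hypothetical survey data: stratum -> (surveyed area in km^2, observed site count).
# All names and numbers are invented for illustration.
survey = {
    "floodplain": (10.0, 18),
    "terrace": (20.0, 30),
    "upland": (70.0, 12),
}


def expected_counts(data):
    """Null model: sites are distributed in proportion to surveyed area."""
    total_area = sum(area for area, _ in data.values())
    total_sites = sum(count for _, count in data.values())
    return {name: total_sites * area / total_area
            for name, (area, _) in data.items()}


def chi_square(data):
    """Pearson chi-square statistic comparing observed with expected counts."""
    expected = expected_counts(data)
    return sum((count - expected[name]) ** 2 / expected[name]
               for name, (_, count) in data.items())


expected = expected_counts(survey)
statistic = chi_square(survey)

# Critical value of chi-square for 2 degrees of freedom at the 0.05 level.
CRITICAL_05_DF2 = 5.991
departs_from_null = statistic > CRITICAL_05_DF2

print(expected)
print(round(statistic, 2), departs_from_null)
```

A statistic above the critical value indicates only that the observed distribution departs from the area-proportional expectation; it does not say which environmental variable is responsible.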


The SARG organization was dedicated to investigating systematically the 
question of why archaeological sites (or, in some versions, prehistoric population
aggregates) in the Southwest were located where they were (Plog and Hill 1971). 
The members of SARG began with the basic assumption that activities were located 
in such a way as to optimize the return on energy investment and then proposed 
three somewhat more specific hypotheses for testing. These hypotheses suggested
that activity loci were

1. situated with respect to critical on-site resources,

2. situated so as to minimize the effort expended in acquiring required
quantities of critical resources, and

3. located so as to minimize the cost of resources and information flow among
loci utilized by interacting populations (Plog and Hill 1971:12).

Most participants concentrated on the first two problems, and in his perceptive 
insider's view of the SARG research design several years after its inception, Dean 
(1978:107) suggests that this was due to procedural and logistical considerations. 
The difficulty of operationalizing and testing the third hypothesis would have been 
great. 

Plog and Hill’s suggested procedures for testing these hypotheses using null 
models and statistical comparisons of where sites were and were not located were 
rarely used by the SARG participants. More often, the SARG researchers concen- 
trated on searching for significant differences in site location frequencies across 
environmentally defined strata. The methods proposed by Plog and Hill have, 
however, become standard in cultural resource management and in some research 
contexts. The potential utility of this brand of locational research was clearly 
foreseen by Plog and Hill (1971:11): 


our research should lead to the ability to predict site locations (and something about
organizational characteristics of sites) from the distribution of critical resources and
other critical variables. And, conversely, we ought to be able to predict the critical
variables by examining the site distribution patterns.


Some of the problems with the “critical resources” concept are noted in the 
Chapter 4 discussion of how variables are selected—in inferential or deductive 
models —as potential determinants of locational behavior. Hill (1971:58) suggests 
that critical resources are those “without which the system would collapse” (but
see Sullivan and Schiffer 1978:172). Dean (1978:108) acknowledges that SARG has 
been primarily concerned with food resources and suggests that availability of fuel, 
structural wood, and other nonfood resources might also be important in determining
site location.

While it is clear that those of us who are engaged in locational modeling owe a 
substantial debt to the SARG participants, it is important to call attention to a final
comment by Sullivan and Schiffer concerning the difference between investigating 
the distribution and movement of people through space in the systemic, behavioral 
context and investigating the spatial distribution of archaeological sites:


[P]rehistoric peoples most likely did not locate “sites” anywhere. However, they did
establish, occupy, and abandon behaviorally significant spaces, such as activity areas,
camps, and settlements. . . . Sites are nothing but deposits of material remains in the
environment that archaeologists recognize as being potentially informative about past
cultural behavior and organization. . . . Owing to secondary deposition, multiple
occupations, and other formation processes, sites usually are not equivalent on a
one-to-one basis to camps, settlements, or population aggregates [Sullivan and Schiffer
1978:169].


The discovery of statistical associations between site types and environmental 
variables, they continue, may be potentially useful for developing predictive 
models for cultural resource management (CRM) and for evaluating survey sam- 
ples, but construction of such models “has little to do with the formulation and 
testing of behavioral principles” (1978:169).


THE ERA OF PREDICTIVE MODELING 


It is clear from the above citations that in the early 1970s there was already 
some talk about predictive modeling, although there were relatively few examples 
of what this term might mean. To avoid ambiguity, we can define a predictive 
locational model as a simplified set of testable hypotheses, based either on behav- 
ioral assumptions or on empirical correlations, which at a minimum attempts to 
predict the loci of past human activities resulting in the deposition of artifacts or 
alteration of the landscape. Thus defined, the potential applications of predictive 
models are certainly not limited to CRM contexts. Green (1973) conducted a 
locational analysis of prehistoric Mayan sites (defined as the loci of one or more 
structures) in northern British Honduras (now Belize). In this research she shared 
the SARG assumption that “sites were located so as to minimize the effort 
expended in acquiring critical resources” (1973:279). Several soil and vegetation 
variables, along with variables reflecting distance from navigable bodies of water (in 
the belief that access to commerce was a critical resource), were tested for associa- 
tion with counts of sites per unit area, using multiple linear regression. The 
resultant multivariate statistical model of site location was interpreted as predicting
high probability for site location in areas with large tracts of good agricultural land 
and in proximity to trade routes. In a sample of 150 quadrats known to contain only 
22 sites, about 22 percent of the variance in the number of sites observed in each 4.25
km² quadrat was explained by the independent variables selected by the regression
routine. Quadrats with high negative residuals (no sites found, several predicted) 
were considered as probably containing undiscovered sites, and such quadrats were 
assigned a high priority for future survey efforts. Because sites were located in the 
centers of arable tracts rather than on their margins, Green inferred that residences 
were probably located so as to have garden plots in their immediate vicinity.
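
The logic of this kind of analysis (regress site counts per quadrat on environmental variables, then inspect the residuals) can be sketched as follows. This is an illustrative sketch on synthetic data, not a reconstruction of Green's analysis: the two predictors, their values, and the simulated site counts are all assumptions, and no variable-selection routine is included.

```python
import numpy as np

# Synthetic quadrat data (invented for illustration; not Green's measurements).
rng = np.random.default_rng(0)
n_quadrats = 150
good_soil_fraction = rng.uniform(0.0, 1.0, n_quadrats)      # share of arable soil
km_to_navigable_water = rng.uniform(0.0, 10.0, n_quadrats)  # distance to trade route
site_counts = rng.poisson(0.1 + 0.3 * good_soil_fraction, n_quadrats).astype(float)

# Design matrix with an intercept column, fitted by ordinary least squares.
X = np.column_stack([np.ones(n_quadrats), good_soil_fraction, km_to_navigable_water])
coefficients, *_ = np.linalg.lstsq(X, site_counts, rcond=None)

predicted = X @ coefficients
residuals = site_counts - predicted

# Proportion of variance in site counts explained by the predictors.
r_squared = 1.0 - residuals.var() / site_counts.var()

# Quadrats with the most negative residuals (fewer sites observed than predicted)
# are the candidates for priority resurvey.
survey_priority = np.argsort(residuals)[:10]

print(round(float(r_squared), 3))
print(survey_priority)
```

A low proportion of explained variance, as in the study described above, means that the ranking of quadrats by residual is a guide for allocating survey effort rather than a statement about where sites are.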


As predictive models began to be applied in CRM contexts, many still- 
unresolved issues concerning the appropriate use of predictive models were identi- 
fied almost immediately. 


Predictive models are probability statements; they are not “facts,” and cannot
substitute for facts in any application requiring the use of hard data about specific
individuals as decisionmaking criteria. . . .

The problem is that some archaeologists have told some planners that our predictive
models can be used as hard data, when in actuality it is our hard data on site location and
significance that must be figured into the planner’s cost-benefit ratio. To substitute a
scientific hypothesis (our predictive model) for scientific fact (actual site location) as a
criterion for a planning decision is to court disaster.

There is only one way for us to get the hard data for use in such decisions: by an
intensive ground reconnaissance of the entire area to be affected by a proposed project
[Wildesen 1974:1-2].


In the latter half of the 1970s the Bureau of Land Management, Forest Service, 
Corps of Engineers, Interagency Archeological Services, and some State Historic 
Preservation Officers were beginning to sponsor both surveys that would result in 
predictive models and attempts to build predictive models from data already 
collected (Interagency Archeological Services [IAS] 1976:3; King 1978:73). Although
important federal historic preservation legislation dates back to the turn of the 
century (the Antiquities Act of 1906; the Historic Sites Act of 1935), the National 
Historic Preservation Act of 1966, amended in 1976 and 1980, has been of signal 
importance in this growth of predictive models, especially Section 106 of that act, 
which requires that federal agencies “take into account” the effects of their actions 
on properties eligible for the National Register of Historic Places (King 1984; Scovill 
1974). In conjunction with Executive Order 11593 (1971), other sections of the
National Historic Preservation Act, the National Environmental Policy Act of 1969, 
and various implementing regulations, this statute gives federal agencies the 
“substantive responsibility to identify historic properties on their lands and nomi- 
nate them to the National Register, and to record such properties when they must 
be destroyed” (King 1984:116). Highly variable legislation for the protection and 
identification of archaeological resources also exists in state and local jurisdictions 
(Rosenberg 1984). 

Federal (and occasionally state) agency response to this legislation has 
included predictive modeling, under the assumption that it will be a long time (to 
say the least) before a total, comprehensive inventory of archaeological resources 
can be conducted on lands under their jurisdiction. 


For comprehensive planning, predictive survey may best be considered an ongoing
process in which increasingly fine-tuned predictions can be made as more and better
information becomes available. If the archaeologist continues to survey a new selection of
sample units every time, he will eventually obtain a 100 percent sample. This is a rational
goal for statewide comprehensive surveys and for federal agency surveys conducted
under section 2(a) of Executive Order 11593. The advantage of predictive survey is that
some useful data for purposes of planning on the entire study area become available almost
immediately . . . and it is probable that all the information needed to carry out
responsible preservation planning will be available before physical inspection has
covered even 380 percent of the land [King 1978:92; emphasis original].


The flood of predictive models that appeared in the late 1970s shows that
contractors were happy to respond to agency requests for such models, even though
(judging by the variability in techniques and products) no one was sure how
prediction might best be accomplished. Early attempts include Dincauze and
Meyer (1976), Fuller et al. (1976), Hackenberger (1978), Robertson and Robertson
(1978), Scott et al. (1978), Woodward-Clyde Consultants (1978), Holmer (1979),
Barber and Roberts (1979), Burgess et al. (1980), Kohler et al. (1980), Muto and Gunn
(1980), and Senour (1980). 


A Taxonomy for Predictive Locational Models 


Before we can begin to talk about the very dissimilar enterprises that have
been called “predictive locational models” during the last 10 years, we need to
establish some definitions and build a classification for what has been done so far.
Another purpose for classification is to highlight what this author believes to be the
most significant dimensions of variability among the predictive locational models
put forward to date. Specifically, I propose a classification with three distinguishing
dimensions: level of measurement, procedural logic, and target context (Figure
2.1).


Many models for site location or settlement behavior are intuitive or not fully 
operationalized. The ugly word operationalization refers to the process of careful
definition of all the terms in a model in such a way that the same predictions can be
made from a model by different people. If a model can be objectively, replicably
mapped, it is operationalized; a model consisting of the statement that “sites are
located near rivers on dry, level ground,” for example, is not mappable until site,
near, river, dry, level, and ground have been rigorously defined.
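
What operationalizing such a statement involves can be shown with a short sketch. Every threshold below is an assumption chosen for illustration (none comes from the literature reviewed here); the point is only that once each term has an explicit definition, two analysts given the same land units must produce the same map.

```python
from dataclasses import dataclass

# Every threshold below is an assumption made for illustration; a real model
# would have to justify each one.
MAX_DISTANCE_TO_RIVER_M = 500.0   # definition of "near"
MAX_SLOPE_PERCENT = 5.0           # definition of "level"
MAX_FLOOD_DAYS_PER_YEAR = 0       # definition of "dry"


@dataclass
class LandUnit:
    distance_to_river_m: float    # distance to the nearest perennial stream ("river")
    slope_percent: float
    flood_days_per_year: int


def predicted_site_location(unit: LandUnit) -> bool:
    """Return True if the unit satisfies 'near a river on dry, level ground'."""
    return (unit.distance_to_river_m <= MAX_DISTANCE_TO_RIVER_M
            and unit.slope_percent <= MAX_SLOPE_PERCENT
            and unit.flood_days_per_year <= MAX_FLOOD_DAYS_PER_YEAR)


units = [
    LandUnit(120.0, 2.0, 0),     # near, level, dry
    LandUnit(120.0, 2.0, 14),    # near and level, but floods
    LandUnit(2500.0, 1.0, 0),    # level and dry, but far from the river
]
mapped = [predicted_site_location(u) for u in units]
print(mapped)
```

Changing any one threshold changes the map, which is why the definitions themselves, and not just the verbal rule, are part of the model.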


As we move to the right in Figure 2.1, we move from models with no
measurement to models based on variables measured at the categorical or nominal
level (such as soil type) or ordinal level (such as resources ranked in order of
hypothesized importance) to models based on variables measured at the interval or
ratio level (such as slope, distance to water, estimated net primary productivity,
and so forth). There is nothing wrong with site location models that are not
operationalized if they provide insights into settlement behavior, as does Binford’s
(1980) distinction, based on a review of hunter-gatherer subsistence and settlement
system organization from around the world, between foragers and collectors. Until
a model is operationalized, however, it cannot be mapped and cannot be used for
management. This is one problem with the informal models of settlement pattern
that are found in many Class I overviews based on existing literature and site files.
The most important distinction along the dimension labeled level of measurement is
between the box on the left, containing unoperationalized models, and the two
boxes on the right, containing operationalized models.


The other two dimensions of this classification—procedural logic and target
context—need to be discussed together. Most predictive models in cultural
resource management have been inductive (used here synonymously with the terms
inferential or empirical correlative) in their logic. That is, they begin with survey data
on the distribution of archaeological materials across the landscape in relation to
some (usually environmental) features, and then they estimate the spatial distribution
of the population of archaeological materials from which the sample was drawn.
The logical alternative to this procedure is to begin with a theory as to how people
use a landscape and to deduce from that theory where archaeological materials
should be located.


Figure 2.1. A suggested taxonomy for the different types of locational models that appear in the literature.

Examples:

1. Many unquantified discussions of prehistoric settlement systems in particular regions; also, Binford’s (1980) forager-collector model

2. Many unquantified discussions of prehistoric settlement patterns in Class I overviews

3. Pilgram 1982

4. Limp and Carr 1985

5. Most cultural resource management predictive locational models: e.g., Kvamme (1980), Nance et al. (1983)

6. Models based on optimal foraging theory (e.g., Winterhalder 1983:207-208) and other model-based approaches (e.g., Jochim 1976)


By target context I mean the “theater of operations” for the model. The 
systemic context (Schiffer 1972) is the dynamic living system observed by ethnographers
and ethnoarchaeologists. (Of course, it too is subject to inference, partial
observation, and informant perception.) The sum total of the materials collected, 
altered, organized, and deposited by the participants in this system, and the spatial 
distributions of these materials, constitute the archaeological context (Schiffer 
1972). This context can never be directly observed, however, and as soon as we 
begin to sample materials from it, analyze them, and make interpretations, we enter 
the interpretive or analytic context (Kohler et al. 1985). Some of the processes and 
activities in each of the contexts are discussed in Chapter 4. 


In two senses inductive models automatically operate in the analytic context. 
First, to make predictions directly about the systemic context they would have to 
make some attempt to control for the effects of the postdepositional and deposi- 
tional processes that separate the analytic from the systemic context (see Chapter 
4); this is rarely, if ever, done. Second, and more insidious, the sampling and analysis 
processes of the analytic context are invisibly imbedded in inferential predictive 
locational models. Any inferential locational model predicts only what would have 
been found had the population of space from which the sample was drawn been 
surveyed in the same manner as was the sample, using the same rules for attribute 
coding, site recognition, and data analysis. Such inferential models predict neither 
the systemic interaction between a cultural system and a landscape nor the archaeo- 
logical context resulting from it; rather, they predict what we will find and how we 
will interpret it if we consistently follow a particular set of rules for fieldwork and
analysis. For this reason I say that inductive models normally operate in the analytic
context. The challenge for inductive models is to build the bridge to the systemic
context by making the analytic methods (including discovery) as “transparent”
(non-bias-making) as possible and by controlling for the effects of depositional and 
postdepositional processes in the archaeological context. 


Deductive models, on the other hand, begin with some theory predicting 
human behavior in the systemic context. The challenge for deductive models is to 
build the bridge to the analytic context, which is where the outputs of the system 
can be observed. This bridge-building — whether from the systemic to the analytic 
context or vice versa—is referred to as explanation (see discussion in Chapter 4). 
Explanatory models, as I suggest the term be used, are inherently neither inductive
nor deductive. Instead, they are models that attempt to build the bridge between 
the dynamics of the living system and its observed outputs. 


There is at least one sense, however, in which deductive models are clearly
preferable to inductive models. Except when we are working with living groups we
are limited to testing predictive locational models in archaeology in the analytic
context. Thus, an explanatory, inferential locational model would end up making
predictions about behavior in the systemic context that could not be immediately
tested, although in a cycle of scientific inquiry these predictions could be used to
suggest theory from which implications for future testing are drawn. An explana- 
tory, deductive locational model would result in predictions for the analytic context 
that would be directly testable. 


Examples 


A detailed review of even a small proportion of the predictive models of the 
past decade would take much more space than is available here. The only reasonable 
way to approach this mass of material is to pick a few themes and trace them 
through a highly selective sample of the available references. Discussions of sam- 
pling, statistical methods, use of remote sensing data, and use of geographical 
information systems are generally avoided here, as they are treated in detail 
elsewhere in this volume. The four models to be discussed here were chosen 
because they illustrate particular cells in the proposed taxonomy and because they 
focus on various geographic regions. 


I would suggest that some of the same criteria used to evaluate research 
designs and theory can be used to assess predictive locational models. One obvious 
criterion that should be applied is the accuracy of these models. Do they supply 
reliable predictions? Unfortunately, this information is available for so few models 
(see Appendix) that other, more general guidelines need to be considered. This, in 
itself, underlines the need for additional attention to model testing, refinement, and
verification. In the discussion below of examples of predictive models, I have
followed Blalock’s (1979) suggested criteria for judging what constitutes good social
science theory in general. 


1. Generalizability. Generalizable models can be applied to large areas, 
rather than small; are applicable to different adaptations and environments,
rather than just to one; take into account the entire settlement system, rather 
than just part of it; and have implications for human organizational systems in 
general as well as prediction of site locations in particular. Generalizability has 
both a conceptualization component —are the theoretical arguments applica- 
ble across a broad range of situations? —and a comparability component —if 
our theories can be applied across a broad range of situations, can our 
measurement operations be guaranteed to be applicable in the same broad 
range of circumstances (Blalock 1982:29)? 


2. Simplicity. Other things being equal, a simple (or parsimonious) model is 
to be preferred to a complex one. After all, one reason people make models in 
the first place is that the real world is too complex to be readily and unambiguously
understood.


3. Internal Consistency. Like other models, predictive locational models must 
be mathematically and logically consistent. 


4. Precision. Precision refers to the fineness of detail in the predictions. 
Precision may involve spatial detail: are predictions made to the square mile 
or the square meter? Or it may involve content: how fine-grained are the 
predictions of what will be found in various locations? Are various possible site 
types, periods, or assemblage types differentiated? Other things being equal, 
a model that is precise in its predictions is to be preferred to one that is not.


5. Falsifiability. It must be possible to prove that a model is wrong.


The last two of these characteristics can be lumped together for convenience, 
since a model that is not precise in its predictions cannot be falsified. Internal 
consistency is a more or less mechanical problem that needs no further mention here 
(but see Kohler and Parker 1986:398). There are, however, severe and perhaps 
unresolvable conflicts among generalizability, precision, and simplicity in predic- 
tive modeling, as in social science theory in general (Blalock 1982:27-31). 


A Predictive Land-Use Model for North-Central Washington 


In an overview based on a survey of existing literature, Robert Mierendorf et 
al. (1981) first constructed a predictive model for site location in a large study area 
encompassing the corridors of two proposed transmission lines and then carried out 
a “sensitivity analysis” for the predicted archaeological resources in these same 
areas. The sensitivity analysis was designed to predict the likelihood that disturb- 
ances in different geographic zones will significantly impair the research value of 
predicted archaeological resources, given the predicted regional research value of 
these resources, their density, and previous disturbances in each zone. I will
consider only the predictive aboriginal land-use model in this discussion. 


If we have to fit this model into one of the pigeonholes shown in Figure 2.1, it 
would probably be best to call this an inductive model aimed at the analytic 
(archaeological) context, at a nominal level of measurement, although to the (rather 
large) extent that the authors rely on an ethnographic model, it could be argued 
that this is primarily a deductive approach. There is no formal statistical model for 
site location, type, or density, but the model was operationalized to the extent that 
a map could be made. To the extent that the model construction relied on data from 
archaeological site excavation and survey, it is fair to call it inductively based. The 
model also takes into account the seasonal distribution and density of resources,
however, and draws on recent hunter-gatherer studies. In some places it apparently
(and implicitly) assumes a least-cost solution to location of settlements in cases of
conflicts between the location of resources. For example, many researchers assume 
that storage of fish and roots was necessary in order for human inhabitants in the 
Columbian Plateau to survive the harsh, resource-poor winter months in a relatively
sedentary fashion. In this study area, fishing and presumably fish and perhaps
root storage were concentrated along the large rivers, the Columbia and Okanogan. 
These same river valleys, however, were probably unfavorable winter locations 
from the point of view of adequate shelter from severe winter winds and the
availability of wood for fuel. Mierendorf et al. assume that in decisions about winter 
village locations priority would be given to distributions of landforms providing 
shelter from winter winds and to the availability of fuel, which is a bulky, heavy 
item in comparison with stored food.

The predictive model is based on a vegetation map and a set of topographical 
contour maps. The model recognizes six broad zones of archaeological resource 
types and densities (Mierendorf et al. 1981:90):


1. Summer hunting and gathering zone; low density. Areas supporting 
summer hunting of dispersed ungulates. The highest elevations, which have a 
mesic vegetation and are accessible only in the summer, are mapped as part of 
this zone. 

2. Summer and fall hunting and gathering zone; low density. Areas support- 
ing dispersed ungulate hunting in the summer and fall. Intermediate eleva- 
tions with a xeric vegetation are included in this zone. 


3. Spring, summer, and fall (on map) and winter (in text) hunting and 
gathering zone; low density. Low-elevation, steppe vegetation zones not 
included in any of the other categories are mapped in this zone. These areas 
are relatively accessible in winter. 


4. Summer fishing camp zone; high density. Areas within 10 km (6.2 mi) of 
falls and rapids on the Columbia and Okanogan rivers and mouths of tributar- 
ies to these rivers are included here. Catchment sizes are modified to reflect 
steep river valleys, resulting in a linear distribution for this zone. 


5. Winter residence zone; moderate to high density. Areas in which stands of 
timber, protected canyons and valleys, and water resources are available 
within a 5 km (1.3 mi) radius of each other are mapped in this zone. 


6. Overlap of fishing camp and winter residence zones; high density. 


Generalizability. This model is intended to be applicable to a study area in 
north-central Washington that covers more than 21,000 km² (8000 mi²). The temporal
scope of the model is assumed to be the entire local prehistoric sequence. Its
applicability to other areas may be slight, inasmuch as it relies on local ethnographic 
analogs and archaeological data for its predictions. 

Simplicity. The model is moderately parsimonious in its selection of inde- 
pendent (causal) variables. Three different types of variables (shelter, fuel, and food 
resources) are considered for their possible effects on the locations of three different 
site types. Both on-site and catchment-area variables are considered. The model 
gains simplicity but loses realism and precision by not incorporating changing 
resource distributions due to changing climates and changing adaptation types due 
to intensification. 













Precision. The model gains precision by considering seasonal distributions of 
resource types and by identifying differing site types and densities. On the other 
hand, the very large study area, the rather poor quality of available maps of 
important resource distributions, and the hand-measurement techniques all contribute
to low spatial resolution in prediction. It is hard to imagine, particularly,
how the distribution of the winter village zone could be accurately mapped using 
these manual techniques. The authors themselves call attention to these shortcom- 
ings (Mierendorf et al. 1981:84, 94). 


In many ways this study is exemplary among the “overview” documents that
attempt to predict prehistoric land use. Most such overviews result in unoperation- 
alized models that remain at a verbal, unmapped, unmeasured level, somewhere in 
the far left-hand box in Figure 2.1. It also avoids too heavy a reliance on existing 
survey records that (if typical of most areas) are biased toward certain types of sites. 
This is achieved by giving more weight to natural resource distribution than to the 
existing site data base and by building a reasonable model for the use of those 
resources by using the ethnographic record. Even granting unlikely climatic stabil- 
ity assumptions resulting in unchanging resource distributions, the danger in such 
an approach, of course, is that if adaptation types other than those present in the 
documented ethnohistory were ever present, they will not be identified or pre- 
dicted by such a model. 


A weakness that this model shares with most overview documents is the 
absence of attempts to validate statistically the variables selected as probable 
determinants of site location. Of course, in cases where no existing data base is
available or where the existing data are irretrievably flawed, this is the only possible 
approach. In other cases, however, there should be an effort to build a null, random 
model for the location of archaeological resources for statistical comparison with the 
actual distributions. Impressionistic isolation of determinant variables should be 
avoided since it may result in the use of variables whose significance cannot in fact 
be demonstrated or in the failure to use variables whose significance could be 
demonstrated. Even if the selected variables are the correct ones, the model will not 
be convincing to those who have other subjective impressions of the determinants 
of site location. 
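
One simple way to build such a null, random model is by simulation: scatter the recorded number of sites at random across the strata, many times, and ask how often chance alone produces a concentration as strong as the one observed. The sketch below assumes invented areas and counts and a null model in which every unit of area is equally likely to contain a site; it illustrates the idea and is not a procedure taken from any of the studies discussed.

```python
import random


def simulate_null(stratum_areas, n_sites, n_runs, seed=0):
    """Scatter n_sites at random, with probability proportional to stratum area."""
    rng = random.Random(seed)
    names = list(stratum_areas)
    weights = [stratum_areas[name] for name in names]
    runs = []
    for _ in range(n_runs):
        draws = rng.choices(names, weights=weights, k=n_sites)
        runs.append({name: draws.count(name) for name in names})
    return runs


def upper_tail_p(runs, stratum, observed_count):
    """Share of random runs with at least as many sites in the stratum as observed."""
    hits = sum(1 for run in runs if run[stratum] >= observed_count)
    return hits / len(runs)


# Invented example: 15 percent of the surveyed area lies near water,
# yet 20 of the 40 recorded sites fall there.
areas = {"near_water": 15.0, "away_from_water": 85.0}   # km^2
observed_near_water = 20
runs = simulate_null(areas, n_sites=40, n_runs=2000)
p_value = upper_tail_p(runs, "near_water", observed_near_water)
print(p_value)
```

A small proportion supports treating proximity to water as a demonstrated, rather than merely impressionistic, correlate of site location in the surveyed sample.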


A Hierarchical Choice Model for Site Location


Before moving to predictive models based on deductive, optimizing assump- 
tions and inductive models involving substantial analysis of ratio-level data, a brief 
discussion of an approach to settlement location analysis proposed by Limp and 
Carr (1985) will be useful. Their model should probably be categorized as a 
deductive approach to the systemic context, on an ordinal level of measurement 
(Figure 2.1). These authors propose that people make decisions about anything, 
including location of activities, by ranking the available alternatives into sets of 
equal preference value and then randomly selecting an alternative from the possibil- 
ities in the highest available preference set. This “general theory of rational choice” 
was derived by Arrow (1951). The ordering of available options into these preference
sets is based on “conditional preference aspects”—those aspects of the
environment (broadly speaking) that directly bear on choices. When there is more
than one “choice-making” aspect to be considered, it is assumed that the alternatives
are evaluated in a sequential, hierarchical fashion by the decision-maker. In
this framework an unfavorable aspect of a location (e.g., no water or too much
water) cannot be mitigated by another, favorable aspect, as could happen in a linear 
additive model. 
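
The contrast between sequential, hierarchical evaluation and a linear additive model can be made concrete with a small sketch. It is my own illustration of the logic described above, not code or data from Limp and Carr: the locations, the break point for distance to water, and the rank values are all hypothetical, and lower ranks denote more preferred sets.

```python
import random


def hierarchical_choice(locations, aspects, rng):
    """Filter candidates aspect by aspect, keeping only the best-ranked set each time."""
    candidates = list(locations)
    for rank_of in aspects:                       # aspects are ordered by importance
        best = min(rank_of(loc) for loc in candidates)
        candidates = [loc for loc in candidates if rank_of(loc) == best]
    return rng.choice(candidates)                 # random pick within the top set


def additive_choice(locations, aspects):
    """Linear additive alternative: sum the ranks, so good aspects offset bad ones."""
    return min(locations, key=lambda loc: sum(rank_of(loc) for rank_of in aspects))


# Hypothetical locations: (name, km to water, shelter rank with 1 = best).
locations = [("A", 0.2, 3), ("B", 6.0, 1), ("C", 0.4, 3)]


def water_rank(loc):
    # Two preference sets only; the 1 km break point is an assumption.
    return 1 if loc[1] <= 1.0 else 2


def shelter_rank(loc):
    return loc[2]


aspects = [water_rank, shelter_rank]
hierarchical_pick = hierarchical_choice(locations, aspects, random.Random(1))
additive_pick = additive_choice(locations, aspects)
print(hierarchical_pick[0], additive_pick[0])
```

Under the additive rule the excellent shelter at location B outweighs its lack of water, whereas the hierarchical rule eliminates B at the first step and then chooses at random between A and C, which fall in the same preference set on both aspects.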

One key decision to be made in the analytic context when using this model is a
choice as to how many preference sets should be assumed to have been in use for 
each choice-making aspect. If there are only two preference sets for each variable — 
satisfactory and unsatisfactory locations—the approach is formally identical to a 
“satisficing”’ approach (Simon 1957), as used by Williams et al. (1973) in the Great 
Basin, for example. As the number of sets that need to be ranked becomes greater 
than two for each variable, the framework approaches the optimization called for by 
classical marginalism: large numbers of bits of information have to be considered by 
both the decision-maker and the analyst. Intermediate numbers of preference sets 
imply an ordinal level of measurement. 


Limp and Carr (1985) present a few brief examples of how this framework can 
be applied in different settings. They convincingly argue that hierarchical choice 
analysis is a realistic model for how people make decisions, since it does not assume 
that they can make, or wish to make, perfect calculation of return rates on every 
variable for every possible location. Nor are the data requirements in the analytic 
context as huge as for an optimal foraging theory model, for example. The hierar- 
chical decision process assumed by this framework does not lend itself to easy 
discovery through any presently available computer algorithms, however, and it
certainly cannot be reconstructed by such linear additive models as multiple linear 
regression, for example (Kohler and Parker 1986:428-430). 


Generalizability. Because of its flexibility and its explicit reference to the 
systemic context, this model has very great generalizability. It has the ability to 
bring all kinds of choice-making aspects into consideration, not just those related to 
food resources. Indeed, one of the problems with the approach is that it is so very
general that it gives few internal guidelines as to how it might be applied to a 
specific area. How many choice-making aspects should we expect? Where should 
the “break points” for a ratio-level variable like distance to water be established for 
each preference set, and how do we know this? Can an inferential technique be 
devised to reconstruct hierarchical decision frameworks from a distribution of 
points with and without archaeological resources? These are important questions
that need to be addressed before application of such models can advance very far. 


Besides these operational difficulties, we may ask to what extent it is appro- 
priate to view all, or most, site locations as the result of “free” decisions in the 
systemic context. Kohler and Parker (1986:432-438) have identified a number of 
constraints on choice, instances in which “rational” decision rules are violated, 
cases where there is extreme lag in response to changing environmental
determinants, and other factors that make it difficult to analyze site location as
though it were the outcome of simple, rational decisions. Then too, it will be
suggested in Chapter 4 that settlement systems have a kind of internal logic that
has little to do with individual or even group decisions at particular moments in
time. Despite these very real problems, it is not easy to see how human behavior
can be analyzed and predicted in the systemic context without considering how and
why people make decisions.

Simplicity and precision cannot be evaluated for this example, since they depend
on particular applications of the framework.


An Inferential Model for Site Location in Central and Southeastern Utah


The next case was selected as an example of the most common approach to
predictive locational modeling in North America and particularly in the arid West.
This is an inferential multivariate predictive model, operating on a ratio level of
measurement and targeting the analytic context. This example, in common with
many others that could be mentioned, is the result of a Class I (sample) cultural
resource inventory, in this case of three tar sands areas in Utah (Schroedl 1984;
Tipps 1984; Appendix, this volume).


For the larger two of the three study areas a two-phased random sample of 
quarter-sections was drawn, selecting 5 percent of the population on the first round 
and an additional 5 percent on the second round. (The third area was simply 
sampled at 10 percent, since it comprised only seven 160-acre quadrats.) The
sequential samples were actually surveyed at the same time, but the results were 
recorded separately so that model building and model testing and revision could be 
conducted using different sets of data. Survey intensity and means of distinguishing 
sites and isolated finds are explicitly described in the report. For each site, probable 
age and cultural affiliation were recorded, and the site was classified into one of 10
descriptive site types (for example, pithouses, rockshelters, and lithic scatters with 
features). A second functional classification, more useful for explanatory purposes, 
was devised by evaluating eight criteria for the 158 site components in the sample:

1. diversity and size of the tool assemblage
2. maximum density of artifacts
3. frequency of debitage (lithic debris)
4. site size
5. number of features
6. type of features and amount of labor investment represented
7. presence of trash or midden deposits
8. presence of stratified deposits


The first five of these variables (those measured at the ratio level) were analyzed
using principal components analysis (see Chapter 5). Four groups emerged on the
two significant factors, and these were interpreted as representing the major
functional types suggested by Binford (1980) for logistically organized hunter- 
gatherers. The non-ratio-level variables were used to check the site classifications; 
these variables usually supported the type assignments made on the basis of the 
principal components analysis. 

Site location analysis began with univariate descriptive frequencies for all sites 
in each study area with respect to elevation, aspect, slope, distance to permanent
water, primary and secondary landform, depositional environment, primary and 
secondary vegetation, and primary and secondary geology substrate. 

One nice feature of this report is the discussion of how point estimates and
confidence intervals for the total population of sites in each study area were
calculated (Tipps 1984). It is relatively rare for confidence intervals to be calculated,
which is a waste of one of the main advantages of the random design adopted by most
surveys. Tipps also warns her readers, quite correctly, that in two of three samples
the amount of skewness relative to the sample size may lead to confidence intervals
that are misleadingly narrow, using the normal parametric estimation techniques
employed (for a discussion of statistical terms used here, see Chapter 5).
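
A point estimate and a normal-theory confidence interval for the total number of sites can be sketched as follows. This is the textbook estimator for a simple random sample of equal-sized quadrats with a finite population correction. It is offered as an assumption about the general kind of calculation involved, not as the procedure actually used in the report, and the site counts and the population of 100 quadrats are invented.

```python
import math
from statistics import NormalDist, mean, stdev

def estimate_total(counts, population_units, confidence=0.95):
    """Estimate the total number of sites in a study area.

    counts           -- sites found in each surveyed quadrat
    population_units -- number of quadrats in the whole study area
    Returns (point_estimate, lower_limit, upper_limit).
    """
    n = len(counts)
    total = population_units * mean(counts)
    # Finite population correction: the sample is a sizable share of the area.
    fpc = 1 - n / population_units
    se_total = population_units * math.sqrt(fpc) * stdev(counts) / math.sqrt(n)
    # Normal-theory multiplier (about 1.96 for 95 percent confidence).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return total, total - z * se_total, total + z * se_total

# Hypothetical counts from a 10 percent sample of 100 quadrats. Most
# quadrats are empty and a few hold several sites, so counts are skewed.
counts = [0, 0, 1, 0, 3, 0, 0, 2, 0, 0]
total, low, high = estimate_total(counts, population_units=100)
print(round(total), round(low, 1), round(high, 1))
```

Because the interval is symmetric about the point estimate, it takes no account of the skewness of the counts, which is the circumstance in which Tipps cautions that normal-theory limits may be misleading.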

Two separate predictive models were developed (Schroedl 1984). One of these, 
incorporating Landsat imagery, turned out not to be very informative and will not
be discussed further. Predictive models were constructed only for the two larger
survey areas, which were somewhat more similar to each other than they were to
the third area, and the two larger areas were pooled for purposes of analysis.
Disappointingly, the functional identification of sites carefully worked out earlier in
the report was not used for locational analysis and prediction, probably because of
sample size considerations imposed by the inferential approach. (Division of the
total pool of sites into its constituent classes significantly reduces the sample size in
each class, which in turn makes it less likely that significant relationships with
environmental variables will be discovered.) Nor is there any analysis of the location
of the considerable number of isolated finds recorded during the survey.


The model-building process went through several preliminary stages. In the
first, nine variables were used in a discriminant analysis to find the best linear
function differentiating between sample quadrats from the initial 5 percent samples
that contained, or did not contain, sites. Distances were measured from the center
of each 160-acre quadrat. The directional aspect was broken into two components to
avoid the problem typically associated with measurements in circular degrees. (A
symptom of this problem is that 359° and 1° are numerically far apart but describe
very similar directions.) The variables were
1. difference between the maximum and minimum elevation in each quadrat
2. distance to nearest permanent water
3. percentage of the quadrat covered by piñon-juniper
4. number of drainages within the quadrat
5. average quadrat elevation
6. distance to nearest river
7. distance to nearest wooded area
8. north-south aspect
9. east-west aspect
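
The splitting of aspect into a north-south and an east-west component (variables 8 and 9) can be sketched as below. The source does not say how the two components were coded. Taking the cosine and sine of the compass bearing is one common convention and is assumed here purely for illustration.

```python
import math

def aspect_components(aspect_deg):
    """Split a compass bearing (degrees clockwise from north) in two.

    Returns (north_south, east_west), each between -1 and +1:
    north_south is +1 for due north and -1 for due south;
    east_west is +1 for due east and -1 for due west.
    """
    rad = math.radians(aspect_deg)
    return math.cos(rad), math.sin(rad)

def component_distance(a_deg, b_deg):
    """Straight-line distance between two aspects in component space."""
    n1, e1 = aspect_components(a_deg)
    n2, e2 = aspect_components(b_deg)
    return math.hypot(n1 - n2, e1 - e2)

# 359 and 1 degrees differ by 358 as raw numbers but are close as components.
print(round(component_distance(359, 1), 3))
print(round(component_distance(359, 181), 3))
```

Coded this way, nearly identical slopes facing just west and just east of north receive nearly identical values, so a linear technique such as discriminant analysis is not misled by the jump from 359 to 0 degrees.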

In the first two-group analysis, only the first four variables were selected by the
stepwise procedure used for construction of the one discriminant function.
Reclassification of the quadrats on which the function was based into their original
groups (sites, no sites) was 73 percent successful; classification error rates for the
quadrats of the second 5 percent sample were about 10 percent higher. These results
are somewhat lower than, although within the range of, other similar attempts
tabulated by Schroedl (1984:155). A second stage of refinement, which involved
discarding three outliers from the analysis and using more of the sample quadrats in
the initial classification-building portion of the discriminant analysis, improved these
results; two additional variables (5 and 6 above) also contributed to the linear
discriminant function.
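
The idea of a two-group linear discriminant function and of a reclassification rate can be shown with a deliberately small sketch. It implements Fisher's two-group discriminant for only two variables, with equal priors and a pooled covariance matrix. It does not reproduce the stepwise, multi-variable analysis in the report, and the quadrat measurements (elevation range in meters, distance to water in kilometers) are invented.

```python
from statistics import mean

def fit_two_group_lda(group_a, group_b):
    """Fisher's linear discriminant for two groups on two variables.

    Returns (weights, cutoff). A case x is assigned to group A when
    weights[0]*x[0] + weights[1]*x[1] exceeds the cutoff.
    """
    def centroid(rows):
        return [mean(r[0] for r in rows), mean(r[1] for r in rows)]

    def scatter(rows, c):
        sxx = sum((r[0] - c[0]) ** 2 for r in rows)
        sxy = sum((r[0] - c[0]) * (r[1] - c[1]) for r in rows)
        syy = sum((r[1] - c[1]) ** 2 for r in rows)
        return sxx, sxy, syy

    ca, cb = centroid(group_a), centroid(group_b)
    dof = len(group_a) + len(group_b) - 2
    # Pooled within-group covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx, sxy, syy = (
        (a + b) / dof for a, b in zip(scatter(group_a, ca), scatter(group_b, cb))
    )
    det = sxx * syy - sxy * sxy
    dx, dy = ca[0] - cb[0], ca[1] - cb[1]
    # weights = inverse(pooled covariance) times (difference of centroids)
    w = [(syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det]
    # The cutoff is the discriminant score midway between the two centroids.
    cutoff = w[0] * (ca[0] + cb[0]) / 2 + w[1] * (ca[1] + cb[1]) / 2
    return w, cutoff

def classify(x, w, cutoff):
    return "site" if w[0] * x[0] + w[1] * x[1] > cutoff else "no site"

# Hypothetical quadrats: (elevation range in m, distance to water in km).
with_sites = [(120, 0.4), (150, 0.6), (110, 0.9), (170, 0.3), (140, 1.1)]
without_sites = [(40, 2.5), (60, 1.8), (30, 3.0), (80, 2.2), (55, 1.2)]

w, cutoff = fit_two_group_lda(with_sites, without_sites)
hits = sum(classify(q, w, cutoff) == "site" for q in with_sites)
hits += sum(classify(q, w, cutoff) == "no site" for q in without_sites)
reclassification_rate = hits / (len(with_sites) + len(without_sites))
print(reclassification_rate)
```

A rate computed this way, on the same quadrats used to fit the function, is optimistic. That is why the report's separate second 5 percent sample gives the more informative error rate.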


The final analysis employed all of the sample quadrats and discriminated three
groups of quadrats: those without sites, those with one site, and those with more
than one site. Reclassification rates were quite high but, of necessity, were based on
the same sample for which the functions were obtained in the first place. In a
three-group solution there may be one or two significant linear discriminant
functions; there are two in this example. The first, explaining about 40 percent of
the total variance, showed that high-elevation quadrats with relatively large
proportions of piñon-juniper contained a larger number of sites than low-elevation,
unwooded quadrats. The second function, which explained about 12 percent of the
total variance, was orthogonal to the first; that is, this function exploited a
dimension of variability uncorrelated with the high-elevation, high piñon-juniper vs
low-elevation, low piñon-juniper dimension. Apparently there were several quadrats
that had a relatively high number of drainages but were not significantly higher in
elevation than those having only a few drainages. These same quadrats were also
located a long way from a river and tended to contain only one site; they were
differentiated from quadrats with no sites or with two or more sites along this
dimension.


Generalizability. It seems probable that this solution exploits a good deal of
variability peculiar to this particular sample; it would be surprising if the second
dimension of variability turned out to be typical of much of the intermountain
West. The first dimension is much more general; a similar discriminator could
probably be found in many areas at similar elevations in the intermontane region.


Simplicity. The final predictive model, in the form of the classificatory equa- 
tions derived from the discriminant analysis, allows unambiguous classification of 
any quadrat from the spatial population into one of the three groups on the basis of 
measurements on six variables (the original nine variables less distance to wooded 
area and the two aspect determinations). 











Precision. The 160-acre quadrats do not allow for very precise prediction of
site location. The author points out, quite reasonably, that achieving higher spatial
precision over large areas is extremely time consuming without the use of such
computerized data-collection aids as geographic information systems (see Chapter
10). Nor are the predictions very fine grained in terms of the types of sites predicted
to be present. Some gain in precision in terms of the number of sites predicted for
unsurveyed quadrats is achieved by the three-group solution, in contrast to the a
priori site/nonsite classes used by most analysts. There is little reason to expect,
however, that the local environment in quadrats with one site should be opposable
to that in quadrats with more than one site along a continuum that is at right angles
to (or uncorrelated with) the continuum that distinguishes between quadrats with no
sites and quadrats with many sites. Some functional differentiation in site types is
almost certainly being exploited here, and the results might have been even better
had this distinction been taken into account for prediction.


An Optimal Foraging Theory Model of Site Location for the Northeastern 
Continental Shelf 


Barber and Roberts (1979) present both an inductive and a deductive approach 
to the difficult problem of estimating site types and densities on those portions of 
the continental shelf from the Bay of Fundy in Maine to Cape Hatteras in North
Carolina that are now submerged but were exposed at or after 18,000 BP. Although
they face unusual measurement problems because of the nature of their study area,
their conceptualization problems are the same as those for a dry-land model. Only
their deductive model, based on optimal foraging theory, will be discussed here;
see the Appendix for a summary of the entire project. 

Optimal foraging theory models are derived from fundamental assumptions in
evolutionary ecology and population genetics in which change in the relative
frequency of traits in a population is interpreted as being due to differential
inclusive fitness among the individuals in that population. From this perspective,
the goal of behavior should be to maximize the individual's proportionate
contribution to the genotype of the next generation. Unfortunately, inclusive fitness
is difficult or impossible to measure, but it may have correlates that can be measured.
Optimal foraging theory assumes that the net rate of energy captured by an
individual (or some similar measure) is such a correlate, and that it will be
maximized by selective forces (Smith 1980:58).


There has been an extended discussion about the applicability of such models
to human populations. Those cultural ecologists who accept the “monism” dictate
discussed above consider these models to be clearly relevant. Eric Alden Smith
(1980:12-15) points out that there is a middle ground between two extreme positions:
(a) that cultural processes are perfectly analyzable in terms of general evolutionary
models, with the only meaningful distinction being that cultural evolution is more
rapid and more finely tuned; and (b) that cultural processes are shaped by purely
cultural goals that have no necessary congruence with biological criteria for
adaptation.

Models for the location of behavior based on optimal foraging theory share
some similarities with the general choice theory used by Limp and Carr. Since they
deal in decisions, both operate within the systemic context. The hierarchical choice
methods essentially specify how choices are made (choice mechanisms), however,
while optimal foraging theory also specifies why choices are made (choice goals). In
one sense the approach advocated by Limp and Carr is more generalizable, since
goals other than optimizing food intake can be accommodated. Optimal foraging
theory is more complete, and perhaps more useful, however, since it contains
internal guidelines to predict exactly what choices will be made given an array of
information on resource costs. Both use a deductive logic for prediction.


The information needed to apply and test optimal foraging models is difficult
and expensive to collect, and it has not been easy to test such models, even in
modern ethnographic contexts (but see Smith 1980; Winterhalder 1983). In the
archaeological context the problems are multiplied immensely. These problems are
particularly serious for the application discussed here, since no detailed
paleoenvironmental maps are available for the inundated continental shelf. For some
of the resources, return rates have been experimentally estimated by Perlman (1976).
Since the rigorous quantification of net resource yields called for by optimal foraging
theory was impossible for most resources, the authors dichotomize the major
potential food resources along two dimensions: the probable importance of the
resource, based on grossly estimated caloric return rates (primary vs secondary
resources), and the degree to which location in the immediate vicinity of the
resource is necessary for efficient exploitation of that resource (determinate vs
indeterminate resources). Shellfish, for example, have relatively low return rates
and are therefore secondary, but they are localized in space and have a large amount
of waste weight, which would encourage location of sites in the vicinity of the
resource (Barber and Roberts 1979:306). The resources characterized in this manner
are shown in Table 2.1.


The authors recognize that the immediate predictions made by optimal 
foraging theory concern what resource patches will be exploited under what 
conditions. Locations of settlements, therefore, are one order of inference removed 
from the predictions that optimal foraging theory is designed to make. The spatial 
resolution of predictions is so low for this particular model, however, that this may
not be a real problem. Predictions are made for four environmental zones: full coastal,
estuarine, inland valley, and upland. Given that the locations of these zones change
during marine transgression, Barber and Roberts separate the extremely long
period of interest (beginning at 18,000 BP) into six 3000-year segments. They also
subdivide the north-south expanse of continental shelf into three subareas: Maine,
southern New England, and Mid-Atlantic.


For each period, in each subarea, predictions are made concerning the probable
site size, site density, and to a limited extent, site type in each of the four
environmental zones (a portion of one of their tables is reproduced here as Table
2.2). The authors assume that site size is correlated with population size; dispersed
populations will be found in areas with “predictable, mobile, and evenly distributed
resources,” leading to small sites. Aggregated populations and, consequently, large
sites will be found in areas with unpredictable, immobile, and clumped resources
(Barber and Roberts 1979:316). The effects on site size of such variables as duration
of occupation and location reuse are not considered. Site density, in turn, is
considered to be a function of the “relative attractiveness of the several
environments for exploitation” (1979:317) and so is predicted only on an ordinal level
within each period, for each subarea. Barber and Roberts intend these projections of
site size and frequency to be suggestive; they do not believe that more precise
estimates could be calculated reliably using available information.


Generalizability and Precision. Models in which both decision mechanisms and
decision goals are fully specified by theory seem to provide the only consistently
deductive, truly rigorous formulation for predicting site location. For optimal
foraging theory models the resources actually used must be inferred for each specific
application, and return rates for these resources must be calculated for each case.
Once these inferences and calculations have been made, however, all predictions as
to resource use then follow automatically from the theory itself. This is in contrast
to the rational choice theory approach described above, or to the satisficing
approach, where preference sets or acceptability criteria must also be discovered
inferentially or made up using rules of thumb.

When optimal foraging theory models are used to predict the locations of
activities resulting in the deposition of archaeological materials, the explicit focus
on the spatial distribution and return rates of food resources (only) is a two-edged
sword. There is no reason to doubt that there will be a general correlation between 
the distributions of archaeological materials and the distribution of exploited 
resources; after two decades of settlement pattern analyses this is no longer a 
surprising conclusion, or even one worthy of research in itself. Considerably more 
work is needed, however, on predicting exactly what these materials will be, how 
they were deposited, and what their relationship was to other materials elsewhere 
on the landscape. This task will require consideration of more than the distribution 
of food resources. 


Optimal foraging theory assumes that all humans are foragers. In Chapter 4 we 
will argue that, since not all humans are foragers, the degree of intensification 
affects the organization of the settlement systems, and this in turn determines how 
spatially predictable the sites generated by that system will be on the basis of 
variables in the natural environment alone. For example, in the case of foragers we 
might expect that many resource patches—especially if they overlap spatially in 
their temporal availability with other nonsubstitutable resources and are relatively 
isolated rather than continuous in their spatial distribution —will support residen- 
tial bases. These same resource patches, however, may be visited intermittently by 
specialized task groups in a logistically organized subsistence system. In still more 
intensified systems, variables other than the distribution of environmental resour- 
ces become increasingly important in the location of residences and other site types. 
We need to begin trying to make predictions with more specificity about how 
human settlement systems interact with the environment —not just where undif- 
ferentiated sites or materials will end up on the landscape, but what kind of use 
these represent in the systemic context. 


The lack of behavioral (and spatial) precision is no fault of these particular 
authors, who suffered more severe measurement problems than most. No large 
predictive locational models have ever been constructed with great behavioral 
specificity. These considerations are relevant here, I believe, because if such
specificity is ever to be achieved it will be through a deductive approach to the 
systemic context, using detailed reconstructions of the resource availability in the 
paleoenvironment. 


The generalizability of optimal foraging theory models for human use of the 
landscape is limited by their relatively low degree of “portability” across different 
adaptation types, especially those of increasing intensification. The precision with 
which these models do what they were explicitly designed to do—predict foraging 
exploitation of resource patches —is probably high in the ideal case, although this is 
difficult to test. In the particular example discussed here, measurement problems 
interfere with high precision. 


Simplicity. Optimal foraging theory models are wonderful in the simplicity of 
their design and the economy of their assumptions. In fact, it is this very simplicity 
that prevents them from being more general. What is not simple, however, is 
handling the mass of ratio-level information necessary to rigorously map a
predictive model based on this theory. For such purposes a geographic information
system (Chapter 10) seems essential.


I do not mean to minimize the shortcomings of optimal foraging theory, 
particularly as they might affect the accuracy of prediction. One such shortcoming 
is the assumption that each resource patch, and the landscape as a whole, will be 
used at its maximum capacity, when in fact hunter-gatherers typically do not 
expand their populations to the carrying capacity of the region. Another is that 
cultures frequently have high-status resources (and conversely exhibit food taboos) 
that do not have any obvious relation to resource abundance or caloric content. 
Readers should consult Martin (1983, 1985), Sih and Milton (1985), Hawkes and 
O’Connell (1985), Yesner (1985), and Smith and Winterhalder (1985) to capture the 
complexity of the recent debate on issues surrounding application of optimal 
foraging theory to human societies. 


Discussion 


Generalizability. One clear conclusion emerges from these four examples: 
deductive theories of settlement location that work from first principles have 
considerably more potential generalizability than do specific models designed for 
particular areas and derived almost entirely through empirical procedures. Thus, 
the framework of decision theory and analysis discussed by Limp and Carr (1985) is
very generalizable; the optimal foraging theory framework is somewhat less gener- 
alizable but can still be applied to differing environments and adaptation types. 


The inductive or inferential framework, as an overall strategy, is very
generalizable. An inductive model can be constructed for any area that has a
partially known archaeological or ethnographic record. But we must differentiate
between a strategy for analysis or prediction (inductive generalization vs deductive
implication) and a model explaining or predicting site location. An optimal foraging
theory
model can be applied in any area; only the structure of the environment in question, 
and the resources actually used, change. Each new inferential model starts from 
scratch: of the infinity of variables that might have affected how people used space, 
which actually did? 


Precision. There is no inherent difference between inferential and deductive 
models in their potential spatial resolution of prediction. As it happens, none of the 
models discussed above had finely resolved spatial predictions, although some 
inferential models (for example, those discussed by Kvamme later in this volume) 
do. There is more to precision than spatial resolution, however. How fine-grained 
are the predictions of site type or of the cultural and natural forces at work in the 
formation of the archaeological record? Are predictions made about assemblage
content? As such questions approach behavior in the systemic context more closely, 
it becomes more natural to frame them in a deductive manner. 


Simplicity. The discussion of the two deductive models for site location 
suggests that there is a general trade-off between simplicity and generalizability.
The optimal foraging theory model is more parsimonious but less generalizable than
hierarchical choice theory. Inferentially constructed models are not necessarily 
more parsimonious than deductive models. Although the examples used here shed 
no light on this question, Limp and Carr (1985) suggest that a few processes can 
generate a multiplicity of forms. Since inferential models deal with forms, and
deductive models with processes, the latter may prove more parsimonious. 


CONCLUSIONS 


Some of what has been said above seems to favor deductive approaches over
inferential approaches to the problem of predicting types and locations of
archaeological materials, and the same will be true of the method and theory
discussion in Chapter 4. And yet, while models are classified one way or another here
for taxonomic purposes, it is evident that neither purely deductive nor purely
inductive models are possible. In the first case, we would not know how to apply the
model to a particular area; in the second, we would not know what variables should 
be considered for inclusion in the analysis. 


Much of this book will be devoted to discussing the kinds of inductively 
derived models that constitute most current efforts in archaeological predictive 
modeling. While empirical correlative models can be very useful in specific cases, in 
this chapter and in Chapter 4 we would like to balance the picture somewhat by 
suggesting that deductive explanatory models should have greater utility in the 
long run. Both the manager and the researcher want predictive models that are 
useful, after all, and as Blalock (1979:120) points out, there are several ways that 
utility can be defined in such a context. One of these is in the significance of what we 
learn through the application of the model. I think that nearly everyone will agree
that it is more significant to learn something about both the systemic and
archaeological contexts at the same time than it is to learn about the archaeological
or analytic context alone, as is so often the case for inferential models.


Another indication of utility is whether the application of the model results in 
predictions that go beyond those that could have been made by common sense or 
by a casual examination of the phenomena in question. As long as we couch our 
analyses in terms of casually observable variables (for example, a dichotomy 
between site presence or absence) it will be hard to transcend common sense 
predictions, such as the prediction that sites will cluster around resources basic to
human needs.


A third potential criterion for utility is the generalizability of a model to other
times and places. In fact, until such time as we begin to gather reliable estimates of
model accuracy, I suggest that we strive to build models that are both generalizable
and precise. If generalizable and precise models can be constructed, I think we will
find that accuracy will take care of itself.


Chapter 3


MODELS AND THE MODELING PROCESS 


Jeffrey H. Altschul 


In Chapter 2 a model was defined as “a simplified set of testable hypotheses.”
Researchers investigating a particular phenomenon create a model by isolating
various components of the phenomenon and then positing (or hypothesizing) a
relationship or series of relationships among them. The result is a simplified version
of the phenomenon that mimics, in a general way, the events or behaviors in
question.


One of the utilities of a model is that it is possible to hypothesize how changes
in one or more components will affect the final state of the phenomenon; that is, one
can predict what the phenomenon will “look like” given specified changes in
particular components. All models are predictive in this sense. It is important to
emphasize, however, that prediction is not synonymous with explanation and that
predictive accuracy alone is not necessarily the best indicator of a model's utility.
For instance, the old adage


Red sky at night, sailor’s delight,
Red sky in morning, sailor take warning


is a perfectly valid predictive model of the weather. Based on a single observation
one can predict whether or not there will be a storm in the immediate future.
Nowhere is it implied that the color of the sky explains why the weather is the way
it is; the only implication is that a particular condition will occur based on a certain
observation.


An explanatory model of the weather might involve a series of differential
equations deduced from theoretical propositions relating air pressure, relative
humidity, wind currents, and the like, and it is quite possible that the predictive
success of this model might be less than that of the old sailors’ adage. The choice
between these two models would depend on one’s goals. Looking at the sky might
be the best approach if one is interested simply in predicting the immediate weather
conditions. If, on the other hand, one wishes to understand the process, then it
would be far better to reanalyze the internal logic of the second model in hopes of
refining the hypothesized relationships among components and ultimately
producing a higher success rate.

A similar situation exists with models that are used to predict the locations of
archaeological sites. If one is simply interested in predicting whether a location will
or will not contain a site, then in many areas of the world a highly successful 
predictive statement would be to say that no individual location contains a site. 
This conclusion is based on the fact that sites are relatively rare “events” and cover 
only a minute fraction of the earth’s surface. For example, two surveys conducted in 
conjunction with predictive modeling in the mountainous sections of the western 
United States showed that in at least these cases a “no site” prediction would have
been right more than 99 percent of the time (Kvamme 1983; Reed and Chandler
1984). 


Cultural resource managers and archaeologists, however, are less concerned
with the overall predictive success rate of a model than with the likelihood of a 
wrong prediction. Basically there are two types of predictive errors: a prediction 
can be made that a location (or area) contains a site when in fact it does not, and 
conversely a prediction can be made that a location does not contain a site when in 
fact it does. The first type of error may lead to increased costs or to inefficient use of 
resources and will be called a wasteful error. Errors of the second type lead to the 
destruction of cultural resources and will be termed gross errors. 


The errors defined above can be associated with the classical Type I and Type
II errors defined by Jerzy Neyman and Egon Pearson in a series of papers in the late
1920s and early 1930s (e.g., 1933a, 1933b). As these statisticians pointed out more
than a half century ago, in a hypothesis-testing framework there are always two
potential errors: we may reject the null hypothesis when it is in fact true (Type I) or
we may accept the null hypothesis when it is false (Type II). To relate these errors
to predictive modeling, we can take as the null hypothesis that an area will not
contain a site. If we reject the null hypothesis when it is true (i.e., accept the fact
that there is a site when there is none) we are committing a Type I error or, as it may
be viewed from a management perspective, a wasteful error. If, on the other hand,
we accept the null hypothesis that a site does not exist in the area when indeed one
does, then we are committing a Type II error, which we have more forcefully termed
a gross error.
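
The correspondence between the two vocabularies can be sketched in a few lines
of Python. The function names and label strings below are illustrative
assumptions, not part of any published procedure; the null hypothesis is taken,
as in the text, to be that a location contains no site.

```python
def classify_prediction(predicted_site: bool, actual_site: bool) -> str:
    """Label one prediction; the null hypothesis is "no site here"."""
    if predicted_site and not actual_site:
        # Null hypothesis rejected although true: Type I, a wasteful error.
        return "wasteful error (Type I)"
    if not predicted_site and actual_site:
        # Null hypothesis accepted although false: Type II, a gross error.
        return "gross error (Type II)"
    return "correct"


def tally(predictions, actuals):
    """Count outcomes over paired lists of predictions and survey results."""
    counts = {
        "correct": 0,
        "wasteful error (Type I)": 0,
        "gross error (Type II)": 0,
    }
    for predicted, actual in zip(predictions, actuals):
        counts[classify_prediction(predicted, actual)] += 1
    return counts


# Hypothetical predictions for four locations, checked against survey results.
example = tally([True, False, False, True], [True, True, False, False])
```

In this hypothetical tally, two predictions are correct, one is a wasteful
error, and one is a gross error.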


An ideal predictive model minimizes both types of errors; that is, it makes
correct predictions. In practice, however, models do make wrong predictions. In
this regard, we can make two observations. First, in general it is much more costly
in cultural resource management to make a gross error than a wasteful one. Second,
the likelihood of making a gross error is inversely related to the likelihood of making
a wasteful error. To see the logic of the second point, one needs to understand that
the primary means of reducing gross errors is by increasing the amount of land
predicted to contain sites. But unless site location can be predicted with no errors
(which is highly unlikely), this procedure will increase the number of wasteful
errors.

The choice between two models, then, has less to do with overall success than
with minimizing errors, especially gross errors. In general, a more powerful
predictive model is one that, for a specific proportion of gross errors to total
predictions, also minimizes the area predicted to contain cultural resources. Let us
assume, for example, that there are two predictive models of site location for the
same region, Model A and Model B. When both models predict that 5 percent of the
region will contain sites, predictions derived from Model A are found to be correct
70 percent of the time, while those from Model B are correct 80 percent of the time.
Our first inclination would be to conclude that Model B is a superior predictor. Let
us say that upon closer examination, however, we find that in all its predictions
Model A makes only 5 percent gross errors while Model B makes 10 percent. For most
management purposes, then, Model A is twice as good as Model B (for additional
discussion of these two types of modeling errors, see Chapter 8).
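
The comparison of Model A and Model B can be written out as a small sketch.
The class and function names are hypothetical; the figures are those of the
example above, and the rule "prefer the lower gross error rate when the
predicted area is equal" is one reading of the argument, not a general
decision procedure.

```python
from dataclasses import dataclass


@dataclass
class ModelSummary:
    name: str
    area_predicted: float    # fraction of the region predicted to contain sites
    correct_rate: float      # fraction of all predictions that were correct
    gross_error_rate: float  # gross errors as a fraction of all predictions


def preferred_for_management(a: ModelSummary, b: ModelSummary) -> ModelSummary:
    """At equal predicted area, prefer the model with fewer gross errors."""
    if abs(a.area_predicted - b.area_predicted) > 1e-9:
        raise ValueError("compare models at the same predicted area")
    return a if a.gross_error_rate <= b.gross_error_rate else b


model_a = ModelSummary("A", area_predicted=0.05, correct_rate=0.70, gross_error_rate=0.05)
model_b = ModelSummary("B", area_predicted=0.05, correct_rate=0.80, gross_error_rate=0.10)

best = preferred_for_management(model_a, model_b)
# Ratio of gross error rates: the sense in which A is "twice as good" as B.
ratio = model_b.gross_error_rate / model_a.gross_error_rate
```

Overall accuracy alone would favor Model B; ranking by gross error rate
reverses the choice.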


TYPES OF MODELS 


Until now the discussion has proceeded as though differences in types of
models were not important. While it may be true that any model that satisfactorily 
minimizes errors can be a useful predictor, the form of the model will determine in 
large part the confidence placed in it and one’s willingness to make it even better. 


The scientific literature is replete with discussions of models, modeling, and 
prediction (e.g., Braithwaite 1960; Hempel 1965; Kaplan 1964; Salmon 1971; Scriven 
1959, 1962; Zetterberg 1963). During the past two decades archaeologists have also 
become increasingly interested in these subjects (Binford 1972, ed. 1977; Clarke 
1968, ed. 1972; Earle and Christenson 1980; Flannery 1968, ed. 1976; Fritz and Plog 
1970; Gardin 1980; Read 1974; Renfrew 1973; Renfrew and Cooke 1979; Renfrew et al.
1982; Salmon 1975, 1976, 1978). Archaeological models range from simple analogs to
complex simulations. Although the properties and forms of the various types of 
models differ in important respects, a more fundamental distinction, which bears 
directly on any discussion of the types of models used to predict site location, can be 
made. 


In general, models can be divided into two groups based on the degree to
which they can be operationalized. Those that contain components or relationships
between components that cannot be measured in a replicable and reliable manner
will be termed intuitive models, whereas those with components that can be so
measured will be called objective models. Objective models are further distinguished
on the basis of (a) the spatial referent of the dependent variable (i.e., whether
aspects of site location for an area or specific locale are being predicted), (b) the
predominant form of procedural logic (inductive or deductive), and (c) the nature of
internal relationships among model components (i.e., whether independent
variables are given equal weight or relative weights). On the basis of these criteria,
three categories of objective models can be defined: associational, areal, and
point-specific models (Table 3.1).








TABLE 3.1.
Types of objective predictive models of site locations

                 Primary Procedural Logic
Model Type       Inductive                     Deductive                      Variable Weight   Spatial Referent
Associational    Overlay or composite models   Adaptive types                 Equal             Area
Areal            Map interpolation             Simulation                     Relative          Area
                 Pattern recognition           Discrete probability
                 Grid prediction                 distributions
                                               Hierarchical decision models
Point-specific   Pattern recognition           Central place models           Relative          Point
                 Point-specific prediction     Gravity models
                                               Optimum location models
                                               Polythetic-satisficer models





The classification presented above differs from the one presented in Chapter 2. 
The previous typology was based on three criteria: the level of measurement of the 
independent variables, the model’s procedural logic, and the target context. Here
our primary concern is not with the level of measurement but simply whether the 
measurements are made in a consistent and replicable way. For models that can be 
operationalized in an objective manner, interest now shifts to the form of the model, 
that is, to the relationships among the internal components and to the nature of the 
dependent variable. 


Intuitive Models 


Intuitive models can be derived through either inductive or deductive logic,
with the reference frame being either the archaeological record or patterns of
human behavior. An example of an intuitive model is the statement, “You’ll find
arrowheads on high ground near water.” This statement may be based on repeated
observation or on a common-sense theory about human behavior. But regardless of
whether the statement is based on inductive observation or on deductive thinking,
the most important characteristic of this model from a scientific standpoint is that
the components are not fully conceptualized. While everyone may understand the
thrust of the statement, there will not necessarily be agreement on what is high
ground or what “near water” means. The relationship(s) among landform, distance
to water, and artifacts is only partially established. Until everyone can agree on
what the terms mean they cannot be operationalized in a way that is replicable.
Until the variables are operationalized they cannot be measured, and without
measurement the relationship(s) cannot be tested.


Many archaeologists might contend that intuitive models are not really models
at all, reserving that term only for constructs that can be measured and tested.
Leaving aside the philosophical issues, there is good reason to consider intuitive
thought in a discussion of predictive modeling. Much of the recorded archaeological
data base in the United States was derived through intuitive models. American
archaeologists have only recently concerned themselves with formalizing their
notions about site location into research designs. Many archaeologists have
surveyed and continue to survey land based on their ideas about where they will find
sites. Moreover, these intuitive models are often the basis for more intensive
research projects. For example, in the early 1970s the Corps of Engineers began
plans for the development of Sardis Lake, a reservoir covering about 1400 ha in
southeast Oklahoma. The agency commissioned a survey that consisted of one
person trying to find as many sites as possible in a 1-month period (Neal 1972). The
survey was based on personal intuition and reports from amateurs and resulted in 31
sites being recorded. These sites, along with six others recorded later, formed the
basis for 10 years of intensive excavation.


Not only have intuitive models been the basis of much professional work, they 
have been the mainstay of amateur archaeology. As a result, recorded site locations
in most areas of the United States do not necessarily reflect where sites are located
but only where people have looked for them. Models of site location based on 
existing data can lead to predictions with very high accuracy rates. After all, if 
people have only looked for sites in certain types of places, then it is inevitable that 
site locations will be highly correlated with specific environmental attributes. This 
is not to say, however, that all data collected on the basis of intuition must be 
ignored. Procedures for reducing the biases inherent in this type of data do exist 
(e.g., subsampling and weighted analysis) and will be discussed in Chapter 7. 


It is important to remember that intuitive models are not examples of bad
science or of bad thinking. Indeed, creativity and intuition are the most important
and most elusive parts of the scientific process. The first question many
archaeologists ask themselves prior to designing a survey for a region is, “If I were a
prehistoric inhabitant, where would I live?” The problem is that many
archaeologists stop there and never formalize their answer. Thus, no matter how
brilliant their insight or how many sites they find, no one can objectively evaluate
how well their model works.


Objective Models

Associational Models


Often archaeologists are interested in determining whether patterns exist in
the data. For instance, suppose a survey records 25 sites in a 1000 ha pinon-juniper
zone and 10 sites in an adjacent 2500 ha sagebrush zone. The first question asked by
an archaeologist might be whether the difference in site frequency between the
vegetative zones is greater than would be expected if there were no association
between site location and the vegetation. One approach to answering this question
would be to compute a goodness-of-fit statistic. If the value obtained exceeded the
critical value of the chi-square distribution at a chosen significance level, the
association could be considered significant.
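
The goodness-of-fit computation for this example can be worked through
directly. The sketch below assumes that, under the hypothesis of no
association, expected site counts are proportional to zone area; the 0.05
significance level (critical value 3.841 for one degree of freedom) is likewise
an assumption chosen for illustration, and the function name is hypothetical.

```python
def chi_square_gof(observed, areas):
    """Chi-square goodness of fit, with expected counts proportional to area."""
    total_sites = sum(observed)
    total_area = sum(areas)
    expected = [total_sites * area / total_area for area in areas]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


observed_sites = [25, 10]   # pinon-juniper zone, sagebrush zone
zone_areas_ha = [1000, 2500]

statistic, expected_sites = chi_square_gof(observed_sites, zone_areas_ha)

# Critical value of chi-square with 1 degree of freedom at the 0.05 level.
CRITICAL_VALUE_05_DF1 = 3.841
significant = statistic > CRITICAL_VALUE_05_DF1
```

With 35 sites spread over 3500 ha, the expected counts are 10 and 25, and the
statistic is (25 - 10)^2/10 + (10 - 25)^2/25 = 31.5, well above the assumed
critical value.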

If it were determined that a significant association existed, the results might be 
used as the basis for a simple predictive model. It might be predicted, for example, 
that in another study area more sites would be found in the pinon-juniper zone than 
in the sagebrush zone. If this prediction were based solely on the patterning 
observed in a single survey, our confidence in it would be fairly low regardless of the 
strength of the association or the proximity of the two study areas. Confidence in 
the expected outcome might be greater if this prediction were based on 15 surveys 
in nearby regions, although we still would not be in a position to express our 
confidence in a quantitative fashion. 


Models similar to the one described above are common throughout archaeol- 
ogy. Many predictive models developed in cultural resource management studies 
take the form of relatively simple pattern-recognition, associational models. For
instance, Kohler et al. (1980) conducted an intensive survey of the Halloca Creek
drainage, which consists of about 2 percent of the area of the Fort Benning Military 
Reservation in Georgia. Twenty-one prehistoric and 10 historical sites were found. 
Site locations were examined to determine whether they covaried with six envi- 
ronmental variables. To evaluate the relationship between soil type and site 
location, for example, the observed numbers of sites per soil type were compared 
with the distribution expected if there was no relationship. After computing the 
appropriate chi-square statistic, the investigators concluded that the relationship 
between site distribution and soil type was nonrandom. 


In a similar fashion Kohler and his colleagues examined the associations 
between site location and vegetation, distance to water, slope, relative elevation, 
and distance to roads. The results suggested that the distribution of sites was 
nonrandom in relation to slope, soils, and horizontal distance to water and that it 
was random relative to the other variables. For each significant environmental 
feature, the investigators defined a variable with two states, favorable to site 
location and unfavorable to site location. A map of each variable was created for the 
entire military reservation, along with a composite map on which the three varia- 
bles were overlaid. Areas where favorable values for all three variables intersected 
were considered high-probability zones; areas with two favorable scores were 
defined as medium-probability zones; and the remaining areas were considered 
low-probability zones. 
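
The overlay rule just described amounts to counting favorable scores. The
following sketch assumes each map cell has already been scored as favorable or
unfavorable on the three significant variables (slope, soils, and distance to
water); the function name and the sample cells are hypothetical and are not
taken from the Fort Benning study.

```python
def probability_zone(slope_ok: bool, soil_ok: bool, water_ok: bool) -> str:
    """Assign a zone from the number of favorable variable states."""
    favorable = sum([slope_ok, soil_ok, water_ok])
    if favorable == 3:
        return "high"
    if favorable == 2:
        return "medium"
    return "low"


# Hypothetical map cells: (slope favorable, soil favorable, water favorable).
cells = {
    "cell_1": (True, True, True),
    "cell_2": (True, False, True),
    "cell_3": (False, False, True),
    "cell_4": (False, False, False),
}
zones = {name: probability_zone(*flags) for name, flags in cells.items()}
```

Note that the rule gives every variable equal weight, which is the defining
property of an associational model in the classification used here.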


Associational models like the one described above are among the most
commonly used predictive models in cultural resource management (e.g., Campbell et
al. 1981, 1983; Chandler et al. 1980; Grady 1980; Klesert 1982, 1983; Larralde and
Nickens 1980; Reed and Nickens 1980; Thomas et al. 1981). These models are
attractive primarily because of their simplicity; they are easy to construct and
relatively straightforward to understand. They are not without their problems,
however. For one thing, it is simply not true that the intersection of several
favorable values for environmental variables will necessarily be a better predictor of
archaeological site location than the individual variables themselves. The
intersection is only a more precise predictor if the variables are independent of one
another, which is highly unlikely with environmental variables. For instance,
well-drained soils are only associated with certain types of landforms and with a
restricted number of vegetative communities. Each of these variables individually
may be highly correlated with site distribution, but before it can be concluded that
the predictive power of the model will be increased by using all three simultaneously
it has to be shown that site distribution is associated with each variable after
controlling for the influence of the other two (see Chapter 5 for an extended
discussion of spatial autocorrelation and statistical independence).
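
A small constructed example may make the independence problem concrete. The
data below are entirely hypothetical: terraces occur only on well-drained
soils, and site frequency depends on drainage alone. Terraces then appear to be
associated with sites, yet the association disappears once drainage is
controlled for, and the intersection predicts no better than drainage by
itself.

```python
def site_rate(cells, keep):
    """Fraction of the selected cells that contain a site."""
    chosen = [cell for cell in cells if keep(cell)]
    return sum(cell["site"] for cell in chosen) / len(chosen)


def make_cells(drained, terrace, n_cells, n_sites):
    """Build n_cells hypothetical cells, the first n_sites of which hold sites."""
    return [
        {"drained": drained, "terrace": terrace, "site": i < n_sites}
        for i in range(n_cells)
    ]


cells = (
    make_cells(True, True, 4, 3)      # well drained, on a terrace
    + make_cells(True, False, 4, 3)   # well drained, off the terrace
    + make_cells(False, False, 8, 1)  # poorly drained, off the terrace
)

rate_terrace = site_rate(cells, lambda c: c["terrace"])
rate_off_terrace = site_rate(cells, lambda c: not c["terrace"])
rate_drained = site_rate(cells, lambda c: c["drained"])
rate_both = site_rate(cells, lambda c: c["drained"] and c["terrace"])
# Controlling for drainage: compare terrace and non-terrace cells
# within the well-drained stratum only.
rate_drained_off_terrace = site_rate(cells, lambda c: c["drained"] and not c["terrace"])
```

Here the site rate is 0.75 on terraces against about 0.33 elsewhere, but it is
also 0.75 on well-drained soils generally, on the intersection, and on
well-drained cells off the terrace. Overlaying the two variables adds nothing.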


A second major problem with associational models is generalization. For the
most part, associational models have been developed as part of Class I overviews or
using the results from surveys of management-selected areas. They are usually not
derived from probabilistic sample surveys and thus may contain biases that will be
magnified if the model is generalized (i.e., extended to areas that have not been
surveyed).


The predictive power of this type of model, and certainly the generalizability 
of associational models, would be increased if the suggestions concerning the 
associations between site location and environmental attributes were not based 
solely on pattern recognition but instead were deduced from principles of human 
behavior. One would then be in a position of demonstrating that an association 
between site location and an independent variable or set of independent variables 
exists, as well as being able to explain why the association exists. 


From a research perspective, explanation is our ultimate goal; only when we 
can explain why the phenomenon occurs can we be said to truly advance our 
understanding of human behavior. Deductively derived models, however, are also 
superior from a management point of view. If we do not understand why patterns 
occur, our confidence that they will reoccur in the future will always be somewhat 
tempered. This is especially true when we deal with human behavior. The assump- 
tion that settlement locations were conditioned by environmental features may be 
valid in a general sense, but it will not explain why sites are frequent in one river 
valley and rare in another. Pattern-recognition models often show that settlement 
distributions are highly patterned, but without some sort of explanatory
framework, management decisions based on these patterns are grounded more on faith
than on reason.


There are only a few examples of deductively based associational models. One
such model was developed by Sabo et al. (1982; see also Sabo and Waddell 1983) in a
cultural resources overview for the Ozark-St. Francis national forests in Arkansas.
These investigators used the concept of adaptation type to model successive
prehistoric and historical human ecosystems in the Ozark Mountains. An adaptation
type relates regional environmental potential to specific levels of socioeconomic and
technological organization. Sabo et al. (1982) defined four prehistoric and 14
historical types in the Ozarks, e.g., Late Pleistocene/Early Holocene hunting and
gathering, and Late Holocene horticultural, hunting, and gathering adaptations.
Expected archaeological site types and their distributions within four major
physiographic zones were derived for each adaptation type. The predictions were
tested with 254 previously recorded sites. For each site, attributes of four
environmental variables were recorded. Q-mode cluster analysis resulted in groups
that corresponded to the predicted site classes.

In general, the Ozark-St. Francis model is more convincing than a pattern- 
recognition associational model, but it would be even more convincing if the 
adaptation types were not so broad. One cannot avoid the sinking suspicion that, 
given the conceptual framework, virtually any result could be viewed as consistent 
with the model. The general approach, however, is in the right direction.


Has the emphasis on associational models in cultural resource management 
contexts really been misplaced? The answer seems to hinge on the stated objec- 
tives. Associational models provide a means of operationalizing the environmental 
variables that may be related to site location. In this sense they are a tremendous 
improvement over intuitive models. Associational models can be used to provide a 
first guess about site location and as a basis for future research; they can, for 
instance, define environmental dimensions that will be useful in stratifying a region 
for a Class II survey. Associational models, then, can be a good first step, but hardly
a step at which to stop. 


Areal Models 


Areal models are those that predict certain characteristics of sites or cultural 
resources, such as density or frequency, per a specified unit of land. For the most 
part, areal models are more attractive than associational models because the latter 
only produce relative statements about site location, such as “more sites will be
found in this area than in that one” or “more sites are found in this zone than would 
be expected by chance alone,” and these statements are often inadequate for 
research or management needs. In many instances researchers and managers want 
to know more than just the fact that one zone will contain more sites than another; 
they want to know how many sites each zone will contain and what the site density 
in each zone will be. 


Answers to such questions lie in the area of estimation, that is, deriving a
reasonable estimate of an unknown characteristic of sites and/or of site distribution
in a specified region on the basis of a sample of that region. This issue falls under the
topic of sampling, which will be discussed in more detail in Chapter 6. Because of the
close association between sampling and many forms of areal models, some
archaeologists have viewed predictive modeling as synonymous with sampling for the
purpose of parameter estimation (e.g., Ambler 1984). There are, however, good
reasons for keeping the two separate. Parameter estimates are based on assumptions
about how the population of characteristics is distributed and how that population
is sampled. When some type of probabilistic sampling design is used (i.e., when the
sampling technique, frame, fraction, unit, etc., are specified), an estimate of a
population value can be computed. While this value is the best guess or prediction
of the population value, it must be remembered that it is not characteristics of the
populations that are being modeled but characteristics of the sampling distribution.
That is, sampling theory makes no statements about how the population was
derived (in this case, about how sites become located in specific places). Instead,
sampling theory only allows us to determine the likelihood that a particular sample
would result given a certain hypothesis about the underlying population.


Predictive models of site location (as they are being defined here) all use some
aspect of site location as the dependent variable that is being predicted by one or
more independent variables. In areal models the nature of the relationship between
the independent variable(s) and the dependent one is usually determined for
relatively small areas, and this same relationship is then projected to exist in larger,
more inclusive areas. Although this notion of projecting from a sample to a larger
population is similar to parameter estimation, many areal models are generalized on
some basis other than probability theory.


Kriging, for instance, is a technique for generalizing that uses the concept of
spatial autocorrelation, in which the presence of a characteristic in one area makes
its presence in adjoining areas more likely (see Chapters 5 and 7). Basically a method
of map interpolation, kriging uses moving averages and involves estimating values,
and the errors associated with those values, for spatially distributed variables.
Although kriging has been most extensively used in trend analysis of geologic
mineral deposits, Zubrow and Harbaugh (1978) have provided several examples of
how this technique can be used to predict site densities on the basis of samples. In
one example they simulate how an archaeologist can divide an area into grid units
and then, using his/her intuition about where sites are located, survey 12.5 percent
of the grid units most likely to contain sites. A kriging analysis of the results
produces site density estimates for the entire region that are reasonably close to the
true values.
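
The idea of estimating both a value and its associated error can be
illustrated with a minimal ordinary kriging sketch. This is not the procedure
used by Zubrow and Harbaugh; the exponential variogram, its sill and range, the
function names, and the four sample densities are all assumptions made for
illustration, and in practice the variogram would be fitted to survey data.

```python
import math


def variogram(h, sill=1.0, rng=2.0):
    """Exponential variogram model (an assumed form, not fitted to data)."""
    return sill * (1.0 - math.exp(-h / rng))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def krige(points, values, target):
    """Ordinary kriging estimate and kriging variance at one target location."""
    n = len(points)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Variogram values between samples, bordered by the unbiasedness constraint.
    a = [[variogram(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(dist(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights, multiplier = solution[:n], solution[n]
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + multiplier
    return estimate, variance


# Hypothetical site densities at the centers of four surveyed grid units.
sample_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
sample_density = [2.0, 4.0, 6.0, 8.0]

center_estimate, center_variance = krige(sample_points, sample_density, (0.5, 0.5))
corner_estimate, corner_variance = krige(sample_points, sample_density, (0.0, 0.0))
```

At the center of the square the four samples receive equal weight, so the
estimate is their mean; at a surveyed location the estimate reproduces the
observed value and the kriging variance falls to zero.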


Kriging and other map interpolation techniques, such as trend surface
analysis, have been largely ignored as bases for predictive models in cultural resource
management, probably because most archaeologists are not well versed in these
techniques. Whatever the reason, it is fair to say that the potential of models based
on map generalization has not been realized. Models of this type could be especially
useful at the Class I or overview stage of work (e.g., Hansen 1984).


One of the most popular types of predictive models used in cultural resource
management is an areal-based pattern-recognition model. Although differing in
form, most of these models utilize sample data to compute a mathematical function,
which is then used to predict some aspect of site location (e.g., presence/absence or
site density) for unsurveyed units. A variety of statistical techniques have been
used in these models, including multiple linear regression, discriminant function
analysis, and logistic regression, but whatever the statistical technique the logic of
these models is the same.


The predictive model developed for the Bisti-Star Lake region of
northwestern New Mexico (Kemrer 1982) is a good example of this type of model. The
Bisti-Star Lake region is located in the San Juan Basin and consists of various tracts
of coal leases totaling approximately 77,500 ha (191,500 acres). The modeling
approach adopted was to use the results of six previous surveys to create a
predictive model. On the basis of these results, Kemrer and his associates devised
eight site classes (Table 3.2), each of which served as the dependent variable in a
separate predictive model.

Landsat multispectral satellite data were then used to classify soil and
water-source characteristics of the area into eight “environmental classes.” Adopting an
approach similar to that often used in remote sensing, the investigators used a 
sample of training pixels (in this case equivalent to an area of about 50 by 70 m) with 
known environmental characteristics to obtain a mathematical function by which 
unknown pixels throughout the area could be classified. In this manner very fine 
scaled environmental data were obtained. 


The next step was to place a 2 by 2 km grid over a map of the Bisti-Star Lake
region. Seventy-eight “environmental” variables were then calculated for each grid
square. Eight of these were simply the number of pixels per unit for each of the eight
environmental classes. A second set of eight variables consisted of the proportion of
pixels per grid square classified into each class. The remaining 56 variables
represented all unique two-way interactions between frequency and proportional
variables, respectively, of the eight environmental classes.
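
The construction of the frequency, proportion, and interaction variables for a
single grid square can be sketched as follows. The text does not say how the
interactions were computed; the sketch assumes they are pairwise products taken
separately within the frequency variables and within the proportional
variables, which yields 28 + 28 = 56 terms. The function name and the pixel
counts are hypothetical.

```python
from itertools import combinations

N_CLASSES = 8  # environmental classes derived from the Landsat data


def grid_square_features(pixel_counts):
    """Frequency, proportion, and two-way interaction variables for one square."""
    if len(pixel_counts) != N_CLASSES:
        raise ValueError("expected one pixel count per environmental class")
    total = sum(pixel_counts)
    proportions = [count / total for count in pixel_counts]
    # Assumed form of the interactions: products of each unique pair of classes.
    count_pairs = [a * b for a, b in combinations(pixel_counts, 2)]
    proportion_pairs = [a * b for a, b in combinations(proportions, 2)]
    return {
        "counts": list(pixel_counts),
        "proportions": proportions,
        "interactions": count_pairs + proportion_pairs,
    }


# Hypothetical pixel counts for one grid square, one entry per class.
features = grid_square_features([100, 200, 50, 150, 300, 100, 60, 40])
```

Each of the eight classes can be paired with seven others, giving 28 unique
pairs per variable set.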





TABLE 3.2.
Bisti-Star Lake region site classes

Site Class           Description                                                      Number
Lithic               undiagnostic lithic scatters                                        410
Anasazi              sites dated to Basketmaker II - Pueblo III, as well as all          178
                     sites considered to be Anasazi but not assigned to phases
Pre-1933 Navajo      Navajo sites dating from the late 1600s to 1933                     146
Post-1934 Navajo     Navajo sites dating from 1933-1980                                  358
Total Navajo         all Navajo sites combined (includes those that could not be  [illegible]
                     assigned a date)
Anglo/Spanish        historical sites dating from 1700-1940                       [illegible]
Unknown historical   historical sites that could not be affiliated with a                 14
                     specific group
Total                all sites combined                                                 1174
From Kemrer 1982:42


The 2 by 2 km grid squares were then used as units of observation for which the
dependent variables (the number of archaeological sites of a specific site class per
unit) and the independent variables (the 78 variables based on different methods of
associating pixels of each spectral class per unit) could be measured. Linear
equations were developed for each site class using a multiple linear regression
formula. In essence, these equations served as predictive models so that if the values
for the eight environmentally related pixel variables could be determined for a grid
unit, the number of sites of each site class could be predicted.


The models were tested with data derived from a 15 percent sample survey of
the Bisti-Star Lake region. Areas surveyed were not chosen through a probabilistic 
sampling design but were instead purposely selected to test the entire range of 
variability in cultural resource density. Based on the discrepancies between pre- 
dicted and observed numbers of sites, the models were refined by recalculating the 
linear equations with the survey data. 


Models such as the one described above have recently become very popular in
cultural resource management (e.g., Gordon et al. 1982; Kranzush 1983; Lafferty et
al. 1981; Morenon 1983; Nance et al. 1983; Newkirk and Roper 1982; Peebles 1983;
Sessions 1979). Much of this popularity is probably due to the ease with which these
models are created and to their apparent predictive power. Two inherent problems
of these models should be pointed out, however. First, as with many spatial analytic
techniques, grid size affects the results. The models developed on the basis of a 2 km
grid in the Bisti-Star Lake region differed substantially from those based on 1 km
squares in nearby regions (compare Kemrer 1982 with Sessions 1979). Studies in
other areas have also shown that widely differing results can be expected as the grid
size is altered (e.g., Kranzush 1983), and thus far no one has been able to resolve this
issue for a particular region, to say nothing of the general case.


A second problem, which is also related to grid size, has to do with the
characterization of the environment. Most often the environment of each unit is
characterized on the basis of one or, at the most, a small number of points in each
grid unit from which environmental variables are measured. These points are
argued to be representative of the environment of the larger grid unit. This
approach is difficult to justify even for small units (say 40 acres or less) and simply 
misleading for large units. Commonly this approach leads to inaccurate predictions. 
For instance, Kranzush (1983) found a relatively high frequency of sites in 40-acre 
units that were predicted not to contain any. She notes that in many cases the 
center point of the unit (from which the environmental variables were extrapo- 
lated) may not have been suitable for settlement but one could usually find at least 
one spot in the unit that was suitable. 
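
The difficulty Kranzush describes can be shown with a toy grid unit. The
3 by 3 layout, the suitability values, and the function names below are
hypothetical; the point is only that a characterization taken from the center
point can differ from one that examines the whole unit.

```python
def center_point_suitable(unit):
    """Characterize a unit by the suitability of its center cell only."""
    rows, cols = len(unit), len(unit[0])
    return unit[rows // 2][cols // 2]


def any_point_suitable(unit):
    """Characterize a unit by whether any cell within it is suitable."""
    return any(any(row) for row in unit)


# Hypothetical unit: True marks a spot suitable for settlement.
unit = [
    [False, False, True],
    [False, False, False],
    [False, False, False],
]

predicted_from_center = center_point_suitable(unit)
predicted_from_whole_unit = any_point_suitable(unit)
```

A model relying on the center point would predict no site for this unit even
though one corner of it is suitable, which is the pattern of error reported
for the 40-acre units.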


The approach developed by Kemrer (1982) for the Bisti-Star Lake region is an
innovative solution to this problem (see also Tipps 1984), but the use of Landsat
images to create environmental variables is not without its difficulties. The
development of an environmental data base at pixel-level resolution requires not only
appropriate aerial photographs but also a detailed understanding of the statistical
procedures involved. For instance, Landsat classes that can be accurately mapped
are often extremely broad owing to the poor spectral resolution of the sensor. Areal
models based on such classes, then, may be of little use to the land manager. Yet,
even when fine spatial resolution is achieved the information is often wasted
because the pixel data have to be aggregated into larger units so that they can be
comparable with other independent environmental variables mapped at cruder
resolution. These and other considerations will be discussed at length in Chapter 9.


Although inductive procedures, such as map interpolation and pattern recognition,
are the most common bases for areal models, such models can also be
developed deductively on the basis of theoretical propositions about human settlement.
Hierarchical decision models (Limp 1983a, 1983b), simulation models (Chadwick
1978; Thomas 1972, 1973), and probability distribution models (Hodder 1976;
Hodder and Orton 1976; Thomas 1972, 1973) are all examples of areal models of this
type. As a group these models are more diverse than other categories previously
discussed. Although they vary widely in their internal logic and procedure, they do
share a common emphasis on explaining why humans settle in certain areas and not
in others.


Theory-based, deductive areal models have not received much attention in
cultural resource management studies, probably for one or more of three reasons.
First, theoretically based models require more time to create. The internal connections
between variables must be explicitly stated, as must the logical arguments
supporting those relationships. Second, validation procedures are more onerous.
Deductive models must demonstrate that they are not only consistent with the data
but also more parsimonious than any alternative. In contrast, inductive models are
judged primarily on the accuracy of their predictions. No claim is necessarily
forwarded about how the population was formed in these models, only that the
dependent variable covaries with one or more independent variables.


Some archaeologists contend that all pattern-recognition models are based on
the assumption that the environment shapes decisions about where humans settle;
this assumption is almost always implicit in these models, and the relationship
between environment and human settlement is never specified. Although theory-based
models also assume a relationship between environmental factors and settlement,
the relationships between environmental factors and locational behavior are
spelled out according to some behavioral theory. Thus, these models are easier to
critique than those based on the generalization that environment is related in some
unspecified way to settlement.


Finally, the predictive statements derived from some types of deductive
models are not of a form that is useful for management purposes. For instance,
probability distribution models yield statements about the expected number of
sites per sample unit, but this type of model will not predict which units will contain
sites. Instead, such models predict that in the aggregate a specified number of units
will contain no sites, a certain number will contain one site, and so on.
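

The kind of aggregate statement such a model produces can be sketched as follows.
The sketch assumes, purely for illustration, that site counts per unit follow a
Poisson distribution; the text does not say which distribution any particular study
used, and the mean density, the number of units, and the function name
`expected_unit_counts` are hypothetical.

```python
import math

def expected_unit_counts(mean_sites_per_unit, n_units, max_sites=3):
    """Expected number of units holding 0, 1, ..., max_sites sites (Poisson assumption)."""
    counts = []
    for k in range(max_sites + 1):
        # Poisson probability of exactly k sites in a unit.
        p_k = math.exp(-mean_sites_per_unit) * mean_sites_per_unit ** k / math.factorial(k)
        counts.append(n_units * p_k)
    return counts

# Hypothetical region: 200 sample units averaging 0.5 sites per unit.
counts = expected_unit_counts(0.5, 200)
for k, expected in enumerate(counts):
    print(f"units expected to hold {k} site(s): {expected:.1f}")
```

The output says how many units should hold a given number of sites, but nothing
about which units those are, which is why such statements are of limited use for
management decisions about particular parcels.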


While the three reasons cited above may account for the less extensive use of
deductively based areal models in cultural resource management, they are not good
reasons. The fact that deductively based models need to be more explicit and are
more difficult to validate should not necessarily be viewed as a detriment. Clearly, a
model that has successfully gone through this process has much more research
utility than one that has not. Even from management's perspective, there is good
reason to keep a balance between inductive and deductive modeling. Inductive
models, as they are currently used in cultural resource management, may provide
useful day-to-day information. They are not, however, designed to provide deep
insight into the relationship between humans and the environment. Yet it is these
latter relationships that underlie, albeit implicitly, all inductive models being used.
In contrast, deductive models have not to date performed well in providing
on-the-ground information for making management decisions. But research into
these models is one of the prime mechanisms of forwarding our understanding of
man-man and man-land relationships that affect the spatial arrangement of human
settlement. Emphasizing one approach to the exclusion of the other is the surest
way to stymie the potential of predictive modeling in general.


Point-Specific Models


In the past few years there has been a growing trend to shift the level of
prediction from the sampling unit to the site itself. Instead of making predictions
about the number of sites in a sampling unit, archaeologists have explored methods
of assessing the likelihood that any particular spot will or will not contain a site. The
appeal of such an approach to both management and research is immense. Not
surprisingly, point-specific models have become the predominant form of site
locational modeling within the BLM’s cultural resource management program (e.g.,
Burgess et al. 1980; Kvamme 1983; Larralde and Chandler 1981; Peebles, ed. 1981;
Reed and Chandler 1984).


Pattern-recognition point-specific models in archaeology are based on procedures
developed in the field of remote sensing (see Chapter 9 for an in-depth
discussion of this subject). In remote sensing, scientists use reflected radiation
values to classify locations of interest on the earth's surface into prespecified groups,
such as forest vs nonforest, wheatfields vs nonwheatfields, and so on. In the simplest
terms, they first calibrate a “training set” of known cases, such as vegetation types,
by measuring different spectral bands; then for other cases, locations with unknown
vegetation types, the different spectral characteristics are used to infer vegetation
types. The validity of such classification schemes is evaluated using test data that
were not used to calibrate the original model.


A similar approach has been adopted in archaeology, using numerical classification
techniques like discriminant function analysis and logistic regression. The
predetermined groups are defined on the basis of certain combinations of discriminating
variables, so that if the same variables are measured for an unknown case it
can be placed with a specified degree of probability into one group or another.


In addition to adopting the numerical classification techniques, many
archaeologists have also borrowed the concept of a binary response variable. That
is, given locations are classified as being a site or a nonsite. This is unfortunate,
because all sites are lumped into one category. No distinction is made between big
and small sites, functionally distinct sites, or sites from different time periods. This
is unrealistic, since nearly all anthropological studies indicate that a particular
configuration of environmental variables is not equally important in all temporal
and functional contexts. With the site/nonsite dichotomy, however, sites are either
present or absent, and all sites are created equal. From a managerial perspective,
information on different types of sites may not only be important but required.
Clearly, different management strategies are required for small lithic scatters and
for large ceremonial centers.


There are also statistical problems associated with lumping all sites into one
group. These will be discussed at length in Chapters 5 and 7. Suffice it to say here
that these problems fall into two groups. The first has to do with the use of what are
usually heterogeneous groups in mathematical models that assume that the groups
being used are internally homogeneous. For example, discriminant analysis is a
popular modeling technique in which two or more groups are statistically distinguished
from one another. If there is only slightly more between-group variation
than within-group variation, the results will be largely useless and can even be
highly misleading. Lumping site classes together almost always increases within-group
variability of the site group, often to such a degree that sites are more
dissimilar to each other than they are to nonsites.
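

The effect of lumping can be sketched with a single variable. The example below is
not a discriminant analysis; it only computes the ratio of between-group to
within-group sums of squares, the quantity on which such techniques depend. The
site classes, the distance-to-water values, and the function name
`between_within_ratio` are all hypothetical.

```python
from statistics import mean

def between_within_ratio(groups):
    """Ratio of between-group to within-group sum of squares for one variable."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Spread of the group means around the grand mean, weighted by group size.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Spread of individual cases around their own group mean.
    within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return between / within

# Hypothetical distances to permanent water, in meters.
camps = [50, 60, 70, 80]
quarries = [900, 950, 1000, 1050]
nonsites = [400, 450, 500, 550]

# All sites lumped into one group versus site classes kept separate.
lumped = between_within_ratio([camps + quarries, nonsites])
split = between_within_ratio([camps, quarries, nonsites])

print(f"sites lumped into one group: {lumped:.4f}")
print(f"site classes kept separate:  {split:.2f}")
```

With these invented values the lumped site group is so internally variable that it
is barely distinguishable from nonsites, whereas the separate classes are sharply
distinguished.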


The second set of problems involves generalization. In the case of areal
pattern-recognition models using probabilistically selected sampling units, generalizing
the results is relatively straightforward. The sampling unit is the same as the
sample element, and parameter estimates can be computed following formulas for
element sampling. This is not the case for point-specific models, since the sites
found within the sampling units are used as the units of analysis. Thus, the sample is
a cluster sample, and unless the appropriate adjustments are made in calculating the
group variances and covariances, there are likely to be serious errors in the
computation of the mathematical function (see Chapter 6).


The preceding discussion does not mean that all pattern-recognition point-specific
models are inaccurate or lead to invalid predictions. Given the strong appeal
of these models and the recent emphasis placed on them, however, it is important to
discuss the problems that can arise. One solution to some of these problems would
be the development of a response variable with multiple categories. The creation of
multiple groups does not invalidate the use of such techniques as discriminant
analysis or logistic regression. It simply makes them more realistic, flexible, and
amenable to management and research concerns. The problem of generalization can
be mitigated by careful attention to how the model will be used. If its sole purpose is
to act as a heuristic device, pointing out patterns of covariation between the
environment and site location, then problems associated with generalizing the
results are probably not as critical as if the predictions were to be used as the basis
for management decisions.


From a theoretical standpoint, the most powerful locational models should be
those that not only predict where sites are located but explain why they are located
there as well. Models of this type include central place models (Berry 1967; Berry
and Pred 1965; Christaller 1966, 1972; Crumley 1976; Haggett 1965; Johnson 1977;
Losch 1954; Skinner 1977; Smith 1976), gravity models (Crumley 1979; Haggett 1965;
Johnson 1977; Olsson 1970; Plog 1976), optimal location models (Wood 1978), and
polythetic-satisficer models (Williams et al. 1973). Some of these, such as central
place or optimal location models, have a long history in the field of human
geography and have only recently been adapted for use with nonindustrialized
societies by archaeologists and anthropologists. Others, such as the polythetic-satisficer
model, have been developed by archaeologists on the basis of ethnographic
research and basic principles of human behavior.


Much like deductively based associational and areal models, deductive point- 
predictive models have been overwhelmingly ignored in cultural resource studies. 
Many of the reasons for this situation cited in the previous sections also hold true at 
the point-specific level. These models are more difficult to develop than correla- 
tional models, and the validation process is more involved. In addition, the accuracy 
of these models is usually not very high. For instance, in archaeology central place
models are generally used more as a yardstick to evaluate deviations from a
theoretical pattern than as a predictor of actual site location.


The land manager reading this section may well have decided that, given the 
inherent difficulties associated with the use of deductive models, the current 
emphasis on pattern recognition represents a conscious decision on the part of 
archaeologists. This is a false impression. Outside the confines of cultural resource
management, pattern-recognition models have been much less discussed or devel- 
oped than their theoretically based counterparts. The reason for this disparity goes 
beyond any simple explanation of academic vs nonacademic research goals. What 
appears to have happened is that a perception has developed among landholding
agencies that locations of archaeological sites can be predicted within acceptable 
accuracy levels. This perception was probably fostered by a number of theoretical 
studies, sponsored at least in part by the BLM and the Forest Service, that 
investigated the potential of pattern-recognition approaches to predicting site 
location (e.g., Cordell and Green 1983; Grady 1980; Hurlbett 1977; Kvamme 1983). 


The net result has been a tremendous emphasis on the methodological issues
involved in prediction at the expense of studies of behavioral processes. The
implications of this trend can be illustrated with a simple example. Let us suppose
that on the basis of environmental attributes 70 percent of all site locations in a
region could potentially be predicted. Let us further suppose that an associational
model was developed that predicted 50 percent of the site locations. By creating an
areal-based discriminant function model the result might be to increase the model's
predictive capability to 60 percent; with a point-specific logistic regression model,
to 65 percent; and with a point-specific quadratic discriminant model, to 67 percent.
The point is that the increase in the sophistication of the statistical models has not
led to a proportional increase in our ability to predict site locations. In this case, as
with most of the research in predictive modeling in the past few years, all of the
effort has been devoted to finding ways of increasing our predictive power through
statistical methods. It is not surprising, then, that as more and more research has
gone into predictive modeling this research has yielded smaller and smaller
increases in predictive power. Because patterns of environmental attributes will
only account for so much of the variation in settlement patterns, no matter how
much time and money are invested in developing statistical methods or sampling
designs, at some level a point of diminishing returns is reached. That point is rapidly
approaching in predictive modeling.
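

The arithmetic of the supposed example can be laid out explicitly. The percentages
below are the ones given in the text; the variable names are illustrative only.

```python
# Share of site locations potentially predictable from environmental
# attributes in the supposed example (percent).
ceiling = 70

# Successive models and the share of site locations each predicts (percent).
models = [
    ("associational", 50),
    ("areal discriminant function", 60),
    ("point-specific logistic regression", 65),
    ("point-specific quadratic discriminant", 67),
]

# Gain contributed by each step up in statistical sophistication.
gains = [models[i][1] - models[i - 1][1] for i in range(1, len(models))]

# Distance still separating each model from the environmental ceiling.
remaining = [ceiling - accuracy for _, accuracy in models]

for (name, accuracy), gap in zip(models, remaining):
    print(f"{name}: {accuracy} percent predicted, {gap} points below the ceiling")
print("gain at each step:", gains)
```

Each refinement buys roughly half the gain of the one before it, and even the most
elaborate model in the example cannot pass the 70 percent ceiling set by the
environmental attributes themselves.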


A legitimate question for a land manager to ask would be, “Is the additional 30 
percent worth it?” There is no simple answer to this question, although an example 
from anthropology may be useful. In a study of political systems in highland Burma, 
Edmund Leach (1954) began with an analysis of the ecological situation. He argued 
that the distribution of two economic systems covaried fairly well with differences 
in environmental settings, but that once the environmental correlates had been 
factored out, a number of differences between systems were still left unexplained. 
Leach used his ecological analysis as a springboard into a more detailed study of the
social structure. The result was a far-reaching (and now classic) analysis of political
and social dynamics embedded in a culture, a result that simply could not have been
obtained through the study of ecological relationships alone. 





As Chapter 4 will make clear, much of what is considered important about the 
study of archaeological remains is part and parcel of the percentage for which 
pattern-recognition models cannot account. Although these models are useful and 
informative in certain contexts, it is also true that no matter how the term is
defined, much of what archaeologists consider to be “significant” begins where
pattern recognition leaves off. 





THE MODEL-BUILDING PROCESS 





It should be clear from the foregoing discussion that there are many kinds of
predictive models of site location. Some are largely or wholly operationalized, others 
are intuitive; some are based on deductive arguments, others are inductive. 
Numerous modeling techniques exist, and the choice of a technique depends on 
research objectives and the available data base. Moreover, predictive models are not 
mutually exclusive. As archaeologists have learned over the past decade, the line 
between induction and deduction is neither hard nor fast. There is no reason why
different modeling techniques cannot be used to analyze the same data, and in fact, 
there is good reason to do just this. 


Regardless of the form of a model or of the specific techniques used, the basic 
steps in the modeling process are the same for all models (Figure 3.1). The rest of 
this chapter will be devoted to outlining this process; Chapters 6-8 will discuss this 
process in much greater detail.


Figure 3.1. The model-building process


Identification of Objectives


As Figure 3.1 shows, the first step in the modeling process is the specification of
goals or objectives. In the process of identifying objectives a clear distinction needs
to be maintained between short-term and long-term goals. In the long run,
management and research goals are probably not that different; cultural resources
are protected for what they can tell us about the past and how the past evolved into
the present. It is the information content of the resources, not their physical
make-up, that has been deemed worthy of preservation. To best fulfill this legal
obligation, federal and state agencies need to know not only where resources are
located but also why they are located there. This objective is our end goal. It is not
at all clear that we can ever reach it, but as scientists we are committed to continue
striving for it.


In order to reach this goal, we need to have a better understanding of the
necessary intermediate steps or short-term goals. Often developers of Class I and
Class II models refer to their results as “preliminary predictive models,” which
suggests that they view these models as intermediate steps along the way to a
better understanding of site location. Perhaps the most significant criticism that can
be made about predictive modeling programs in most cultural resource management
contexts is that there is no consensus as to the overall objective of these
programs. Models continue to be developed as if they represented the desired end
product. Instead of calling for the refinements of existing models, scopes of work
usually require the creation of a new model. The results are not cumulative, and
thus it is little wonder that most federally sponsored predictive modeling programs
are bogged down in a seemingly endless progression of virtually identical models.


From the perspective of the land management agencies, it would be prudent
to identify both long-term goals and the steps needed to achieve them. On the basis
of this overall plan, an agency could decide whether it would be more productive to
award a contract for an overview that requires the creation of a multivariate model
of site location or whether it would be more useful to invest that effort in research
designed to develop locational variables that make sense from a theoretical
standpoint.


Data Collection


The first step in modeling locational behavior for a specific region is to amass
the available data. Four basic sources of data are commonly used: historical docu- 
ments, ethnographic research, archaeological data, and environmental data. 


Historical documents include explorers’ and colonial accounts of Native American
culture and associated settlement patterns. Land-use records are sometimes
available, as are baptism and death records for Spanish missions. The latter are
especially useful for examining such issues as intergroup movement, population
change, and ethnohistoric settlement patterns (e.g., Munoz 1982). Many of these
records have been examined by ethnohistorians, and secondary sources exist for
nearly every region of the United States.


Ethnographic research represents a complementary data source. Ethnographic
analogy of one form or another has been a mainstay of archaeological interpretation
since the inception of the discipline. Ethnographic analyses of indigenous subsistence
and settlement systems were used by archaeologists as the basis for
settlement-pattern studies long before cultural resource predictive modeling
became an issue. Perhaps the best known study of this type is Julian Steward’s
(1938) Basin-Plateau Aboriginal Sociopolitical Groups, which served as the foundation for
numerous settlement and subsistence models both within and outside the Great
Basin (Flannery and Coe 1968; Jennings 1957; MacNeish 1964; Thomas 1972, 1973;
Williams et al. 1973). In addition to direct analogy, ethnographic studies are useful
as sources for general propositions about settlement decision behavior (e.g., Jochim
1976; Lee and DeVore 1968; Yellen 1977). Finally, the growing field of ethnoarchaeology
continues to supply much-needed data on factors and constraints
leading to decisions about where people live as well as on depositional and
postdepositional processes that affect the archaeological record (Ascher 1962; Binford
1976, 1978a, 1978b, 1979, 1980, 1981; Coles 1973; Gould 1978, 1980; Kramer 1979).


Recorded archaeological data exist in a variety of forms. Site records are stored 
at the state level, either in a central repository or dispersed among several state
institutions (usually museums and universities). Several federal agencies keep their 
own records, which may or may not be duplicated at the state repository. Regional 
data bases, such as the Southwestern Anthropological Research Group (SARG; 
Euler and Gumerman 1978) and Intermountain Antiquities Computer System 
(IMAC; University of Utah et al. 1982), exist for some areas. Private institutions, 
museums, and local historical and archaeological societies also may have informa- 
tion. Finally, as has been true since the beginning of archaeological research, one of 
the best sources for site locational information is the local informant.


Extant archaeological data vary considerably in quality and quantity. In order
to assess the existing data one must evaluate a number of factors. The number and
intensity of surveys have a direct bearing on the distribution of known sites and the
types of sites recorded. Definitional criteria for sites are often subjective and
nonreplicable. The reliability and comparability of recorded information is an open
question that must be resolved before this information can be used in model
building (see Chapter 7).


Environmental data can be gathered at two levels. At a macro or regional level,
data can be collected on a variety of topics, including climate, vegetation, geology,
hydrology, and physiography. Sources of these types of data include many federal
and state agencies, such as the Soil Conservation Service, the Forest Service, the
U.S. Geological Survey, the Fish and Wildlife Service, the National Oceanic and
Atmospheric Administration, and the Bureau of Land Management. An increasingly
important source of data on environmental conditions is aerial imagery.
Remote sensing and Landsat images have emerged as extremely useful tools for
identifying and classifying environmental dimensions and as means of objectively
measuring environmental variables (see Chapter 9).


At the site level we are often interested in which environmental features
affected the decision to settle in a particular spot. Studies of this nature are classified
under the rubric of catchment analysis (e.g., Higgs and Vita-Finzi 1972; Jarman et
al. 1972; Roper 1979; Vita-Finzi 1969, 1978; Vita-Finzi and Higgs 1970). Environmental
zones that surround each site can be analyzed in terms of their potential
economic value. Studies of this type, coupled with environmental and subsistence
data from excavated sites (e.g., pollen, flora, fauna, and malacological analyses), can
help to shape our understanding of the subsistence strategy.


Data Synthesis and Evaluation 


Once the available data have been gathered, they must be synthesized and 
evaluated in terms of their applicability for predicting site location. One of the first 
tasks is to identify general trends of cultural change and stability and trends in the
distribution of known sites. Map interpolation techniques, such as trend surface 
analysis, kriging, etc., can often be useful aids in discerning general trends. 


One result of this type of background research must be the identification of 
known sites or at least of the types of sites crucial to understanding regional 
settlement systems. Here interest lies in determining the effects of what some 
authors call the “big site” phenomenon (Rogge and Lincoln 1984) and what will be 
called “magnet” sites in Chapter 6. Implied in the notion of a magnet site is the
existence of social factors that led people to locate other types of sites closer to or 
farther from a particular site than would be expected just on the basis of the 
prevailing subsistence system. Unless the exact locations of these magnet sites are 
known, it is extremely doubtful that site locations can be successfully predicted in 
that region. 


In the Santa Cruz River Valley of southern Arizona, for example, a predictive 
model was developed on the basis of a Class I overview (Westfall 1979). A Class II 
sample survey demonstrated that the Class I model overestimated the importance 
of certain environmental zones and therefore was not particularly useful. A second 
predictive model, which was based on environmental variables derived from work 
in the Gila Bend area about 80 km (50 mi) to the west (McCarthy 1982), was also 
tested against the Class II results and again was found not to be a very accurate 
predictor of site location. An intensive Class III survey revealed the problem; three
major Hohokam communities were identified in environmental contexts that did
not contain such communities in the Gila Bend area. Each community consisted of a
central platform-mound complex surrounded by smaller sites lying within 1.5-5 km
of the central complex (Rogge and Lincoln 1984). Only a small proportion of sites 
were found outside these communities. 


In most areas of the country the proportion of known, large, complex sites is
higher than the corresponding proportion of known sites in other categories. People
have been drawn to large sites, especially those that exhibit major architectural
features or mounds, since the nineteenth century. Many of these sites, which
probably represent social centers and/or the top elements of the regional settlement
hierarchies, have been formally recorded or are at least known to local
residents. The point is that in areas where socially complex societies developed,
predictive models based solely on environmental variables are bound to fail. Yet, in
most areas the locations of many of the magnet sites are known and can be
determined either by examining the existing site records or asking local informants.
Thus, the existence and importance of these sites can be evaluated at an early stage
in the modeling process (say a Class I level). If this were accomplished, the
construction of useful social predictive variables should be possible.


This discussion of magnet sites points out the importance of being able to 
distinguish site classes. Ideally, site classes are defined along two dimensions, time 
and function. In practice, however, this task is often difficult even with excavation- 
based data, to say nothing of the problems involved in using site files or even 
survey-based data. At the data-evaluation stage it is important to determine (or
hypothesize) the types of sites expected to be found for each culture period and 
their probable locations. The magnitude of the discrepancy between theory and 
existing data can then be gauged. That is, we can determine how many sites can be 
classified by period and function, with the remaining sites grouped into a residual 
category. Examination of the residual category, which in many areas of the western 
United States will constitute between 60 and 80 percent of all recorded sites, will 
determine the types of research questions that can legitimately be asked. These 
questions in turn will affect the type of dependent locational variables that can be 
modeled and thus the nature of the independent variables that can be used. 


Identification of environmental dimensions along which site locations vary is 
an important step. It is, however, only one step. Most predictive models developed
in cultural resource management contexts have viewed this step as the only one or at
least the most important one, paying lip service to other factors affecting site
location. It is also important to bear in mind that the environmental variables that
directly covary with site location are probably best viewed as proxies for whatever 
decision-making criteria led to the selection of locations exhibiting this environ- 
mental feature (Kohler and Parker 1986). For example, landform may be a proxy for 
considerations of defense, agricultural potential, floral resources, or any other 
reason that a group may have for choosing a place to live or to conduct activities. It 
follows that several environmental variables may reflect the same decision-making 
criterion or that one environmental variable may be an indicator of portions of two
or more decision criteria. Moreover, the criteria for choosing a site location were 
probably different in different parts of a single settlement system, and certainly 
these criteria changed through time and between settlement systems. 


Given this situation, it would be best to study the covariation of environmental
features with each separate site class. This ideal situation is rarely realized
because of the problems of distinguishing site classes, but it is still possible to model
expected distributions of sites based on theoretical principles or ethnographic cases
and evaluate the results against the known data base. Using ethnographic information
about Great Basin settlement systems, Thomas (1972, 1973) wrote a computer
simulation that projected the expected distribution of cultural remains across
environmental zones and then tested these predictions against the archaeological
record. This approach offers a way of evaluating the effects of environmental
attributes on the settlement system that could be a powerful complement to the
pattern-recognition studies in vogue today.


In addition to examining environmental factors that affect decisions about 
where to settle, we need to evaluate the natural processes that affect the creation 
and present state of the archaeological record. Archaeologists have become increas- 
ingly sensitive to the difference between the systemic context in which residues of 
past behavior are deposited into the archaeological record and the archaeological 
context in which they are recovered (Ammerman et al. 1978; Binford 1976, 1978b, 
1979, 1980; Ebert et al. 1984; Schiffer 1968, 1976; Schiffer and Rathje 1973; see also 
Chapter 4 of this volume). In general, this growing awareness has not been 
incorporated into predictive models, probably because of our poor understanding of 
these processes and of the attendant difficulties in modeling them. Failure to take 
into account depositional and postdepositional processes leads to predictive models 
that, at best, predict where sites have been seen and not necessarily where they are 
or were. 


Several recent studies indicate the potential for increasing the power of 
predictive models by including geomorphic factors. For example, Artz and Reid 
(1983) use a relatively simple soil-geomorphic model to predict the location of 
buried Archaic sites in the Little Caney River Basin of northeastern Oklahoma. 
Previous surface surveys had not found any Archaic materials in the area, leading 
some investigators to question whether the region had been occupied during this 
period. Artz and Reid developed a model based on the proposition that the relative 
age and stability of a geomorphic surface is often reflected by the properties of the 
soil developed below it. The model was used to identify buried surfaces that in the 
past were suitable for habitation. Subsequent investigation of these surfaces showed 
that Archaic sites, although buried, were indeed located in the Little Caney River 
Basin. 


Model Components— Dependent Variables 


To develop a model one has to be clear about exactly what it is that is being
modeled. As far as site location is concerned there are a variety of potential
dependent variables. It is possible to predict site presence or absence, site density,
site types, site functions, or various combinations thereof. Moreover, the dependent
variable can change, although this will require either drastic internal revisions
or an entirely new model. For example, at an early stage of research archaeologists
might predict that sites will be found in greater numbers in areas within 100 m of
permanent water and on land with slopes of less than a 5 percent grade. Formally
this relationship might be expressed as


$$P(A \mid B \cap C) > P(A)$$


where P(A) stands for the probability that an area contains a site, B for areas within
100 m of permanent water, and C for land with slopes of less than 5 percent grade.
Thus, the equation simply states that the probability that an area contains a site is
greater if it meets conditions B and C than it is for all areas in general.
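

As a minimal sketch of how this inequality might be checked against survey data, the
following Python fragment estimates both probabilities as simple proportions. The
SurveyUnit type, its field names, and the eight survey units are hypothetical,
invented purely for illustration; they do not come from any actual survey.

```python
from dataclasses import dataclass


@dataclass
class SurveyUnit:
    has_site: bool       # A: the unit contains at least one site
    near_water: bool     # B: within 100 m of permanent water
    gentle_slope: bool   # C: slope of less than 5 percent grade


def site_probabilities(units):
    """Return (P(A), P(A | B and C)) estimated as simple proportions."""
    if not units:
        raise ValueError("no survey units")
    p_a = sum(u.has_site for u in units) / len(units)
    favorable = [u for u in units if u.near_water and u.gentle_slope]
    if not favorable:
        raise ValueError("no units meet conditions B and C")
    p_a_given_bc = sum(u.has_site for u in favorable) / len(favorable)
    return p_a, p_a_given_bc


# Hypothetical survey units (assumed values, for illustration only).
units = [
    SurveyUnit(True, True, True),
    SurveyUnit(True, True, True),
    SurveyUnit(False, True, True),
    SurveyUnit(True, False, True),
    SurveyUnit(False, True, False),
    SurveyUnit(False, False, False),
    SurveyUnit(False, False, True),
    SurveyUnit(False, False, False),
]

p_a, p_a_given_bc = site_probabilities(units)
print(f"P(A) = {p_a:.3f}, P(A | B and C) = {p_a_given_bc:.3f}")
print("relationship holds:", p_a_given_bc > p_a)
```

A comparison of this kind says only that sites are more frequent where both conditions
are met; it says nothing about why.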


At a later stage of research it may be found that the relationship between site
location and the two independent variables is much more precise. This relationship
might be modeled with a linear equation of the form


$$A = D + F_1 B + F_2 C + E$$


where A equals site density; B is distance to water in meters; C is slope in degrees;
$F_1$ and $F_2$ are the weights for B and C, respectively; D is a constant; and E is an
error term. In this case two independent variables are being used to predict the number
of sites per survey unit. While the two equations represent two fundamentally
different models, it is also fair to say that they are part of the same model-building
process, with the latter equation being a more refined expression of the former.
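

The weights and constant in a linear equation of this form are usually estimated by
ordinary least squares. The sketch below is one way to do that in plain Python for
exactly two independent variables; the function name and the five data points are
assumptions made for illustration, and the data were constructed to follow a known
linear relationship so that the result can be checked by eye.

```python
def fit_site_density(a, b, c):
    """Least-squares fit of A = D + F1*B + F2*C + E for two predictors.

    a: site density per survey unit, b: distance to water (m), c: slope (degrees).
    Returns (D, F1, F2, residuals), where the residuals estimate E for each unit.
    """
    n = len(a)
    if not (n == len(b) == len(c)) or n < 3:
        raise ValueError("need at least three units with A, B, and C recorded")
    mean_a, mean_b, mean_c = sum(a) / n, sum(b) / n, sum(c) / n
    # Centered sums of squares and cross-products.
    s_bb = sum((x - mean_b) ** 2 for x in b)
    s_cc = sum((x - mean_c) ** 2 for x in c)
    s_bc = sum((x - mean_b) * (y - mean_c) for x, y in zip(b, c))
    s_ba = sum((x - mean_b) * (y - mean_a) for x, y in zip(b, a))
    s_ca = sum((x - mean_c) * (y - mean_a) for x, y in zip(c, a))
    det = s_bb * s_cc - s_bc ** 2
    if det == 0:
        raise ValueError("B and C are perfectly collinear; weights are undefined")
    f1 = (s_ba * s_cc - s_ca * s_bc) / det
    f2 = (s_ca * s_bb - s_ba * s_bc) / det
    d = mean_a - f1 * mean_b - f2 * mean_c
    residuals = [y - (d + f1 * x1 + f2 * x2) for y, x1, x2 in zip(a, b, c)]
    return d, f1, f2, residuals


# Hypothetical data built from A = 10 - 0.01*B - 0.5*C with no error term.
distance = [50, 100, 200, 400, 800]
slope = [1, 4, 2, 8, 3]
density = [10 - 0.01 * x1 - 0.5 * x2 for x1, x2 in zip(distance, slope)]

d, f1, f2, residuals = fit_site_density(density, distance, slope)
print(f"D = {d:.3f}, F1 = {f1:.4f}, F2 = {f2:.3f}")
```

With real survey data the residuals will not vanish, and their pattern is often the
first indication that a linear form is inappropriate.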


Ideally the dependent variable should be specified first, followed by creation of
the model. Usually in predictive modeling, however, a dependent variable is
selected on the basis of the data available and the types of independent variables
being used. Most archaeologists tend to be less concerned with the exact nature of
the dependent variable (as long as it bears on some aspect of site location) than with
meeting the assumptions of the modeling procedure, especially in a mathematical
model.


In general, we want to proceed from crude measures of site location, such as
relative density (i.e., more sites here than there), to more powerful variables that
will predict a specific site type in a particular location. Although the level of
locational specificity modeled is directly related to the nature of the data that can be
used to test it, it is necessary to guard against blind acceptance of a dependent
variable simply because a particular modeling technique is used. Deciding to
predict site density because “that’s what multiple linear regression predicts” is
definitely putting the cart before the horse. Selection of an appropriate dependent
variable has to do with defining management and/or research objectives as well as
identifying the nature of the data base available or being collected. Once this
decision has been made, an appropriate way to model the phenomenon can be
found.


Model Components —Independent Variables 


Selecting independent variables and determining their interrelationships are
perhaps the most difficult steps in the model-building process. There are no rules
that govern this process and few guidelines that can be offered. Variables and their
relationships can be derived from inspiration, intuition, creative thought, and/or
previous experience. Certainly it is true that if one has a good grasp of general
anthropological or sociological propositions about the factors that affect decisions as
to where behaviors will be conducted, one is more likely to make an informed choice
of variables. There is no guarantee, however, and Clark’s (1982:232-234) discussion
of false starts and mental gestation periods aptly describes this process.


The development of model components and the definition of their interrelationships
should be the areas in which archaeologists make their greatest contribution
to the predictive modeling process. This, however, has not been the case.
Instead, there has been a tendency among archaeologists producing predictive
models to concentrate on the sophisticated multivariate mathematical techniques
and to give only casual attention to the predictive variables. In most cases,
methodological discussions focus on the inner workings of the statistical procedures
with only passing references to the reasons why specific variables were chosen or to
how these variables are theoretically related to site location. Indeed it appears that
investigators are assuming that the relationship(s) between the environment and
site location cannot be specified, other than that there is one, and that if only
enough environmental variables are put into the equations something useful will
come out.


There is nothing wrong with searching for patterns, but it is important to
realize that the ways in which aspects of the environment are conceptualized and
measured seriously affect the types of statistical tests that can be used as well as how
they are interpreted. Since most archaeologists are more attuned to the relationship
between site locations and the surrounding environment than they are to statistical
theory, it stands to reason that it is in this area of specification of locational/environmental
relationships that archaeologists could make important inroads.


In an ideal setting a predictive model would be built by first identifying the
characteristic of site location, such as site density or frequency (i.e., the dependent
variable) and then identifying all the social, environmental, and geomorphic factors
(i.e., the independent variables) that impinge upon it. One can envision a series of
differential equations describing the relationships among the various factors. In
order to learn whether a site would be found at a particular location one would
simply assign appropriate values to the variables in the equations, and “presto!” the
answer would appear. Unfortunately, at this time such a model cannot be created.
While it might be possible to incorporate all three factors into one model, the result
would be extremely complex, difficult to evaluate, and probably would have very
low predictive power.


Perhaps the best approach for now is to develop a series of models. For
instance, it might be hypothesized that settlement in a specific river valley followed
some process that can be modeled with a specific probability distribution. The
importance of specific environmental variables might be assessed through the use of
a pattern-recognition technique. Finally, a model of paleo land surfaces that would
have been suitable for habitation could be constructed using information about
geomorphic processes. The results of the models would be mutually reinforcing. If
one model worked better than another in a particular area, this information could be
used to refine the model and eventually would yield a better understanding of the
settlement process.


Regardless of whether one or several models are developed, the form of each
model will be the same. In each case a dependent variable will be predicted by one or
more independent variables. Some models in archaeology consist of logical statements
(such as “if . . . then”) that connect the independent variables in some type
of causal or deterministic fashion. These models are useful when theoretical reasons
can be posited for the connections. Often, however, archaeologists cannot be this
specific, and in these cases there are two advantages to using a mathematical
model: the relationships between the variables are explicit, and the variables must
be objectively defined and measured, a feature often lacking in the logical models.


The major disadvantage of mathematical models is that each model comes with
its own set of underlying assumptions. For instance, most of the statistical techniques
used in predictive modeling assume a linear relationship between the
variables. Theoretically, there is no reason to believe that the relationship between
site location and the environment is linear any more than it is quadratic or any other
function. While the goal is to work toward theoretically defined connections
between variables, a start must be made somewhere, and it is perfectly reasonable
to begin this process by using predefined relationships between variables as long as
it is understood that these relationships are arbitrary.


Once a specific modeling technique is chosen the necessary data to develop the
model must be gathered. For some types of models the data may already be on
hand. Associational models can be developed on whatever data exist. The minimal
restrictions imposed by these models and the ease with which they can be developed
probably account for their popularity in overview-level research.


Other types of models will require the collection of new data or the reformatting
of existing data. For example, once it is decided to model site density per
square kilometer (A) on the basis of slope (B) and distance to water (C) using a linear
equation of the form


$$A = D + F_1 B + F_2 C + E$$


information must be collected on A, B, and C so that the weights ($F_1$ and $F_2$),
constant (D), and error term (E) can be defined.


The decision as to whether to use existing data or to collect new data to
develop the model will depend on the following criteria. Are the temporal and
functional site classes that can be defined with existing data sufficient for the model?
Are the environmental data that can be obtained from existing maps or site forms
suitable for the proposed model? In particular, can patterns in microenvironmental
variability be identified from existing records and does the distribution of known
sites by environmental zone reflect aboriginal settlement decisions or is it skewed
by postdepositional processes? Finally, since many predictive models generalize
from a sample, can the existing data be considered in any sense to be representative
of the phenomena of interest? To answer these questions, information must be
gathered about the size and distribution of previous surveys as well as their
intensities. Using this information, the researcher can determine whether survey
results are comparable, if all environmental zones have been adequately covered,
and if the types of sites found within the surveyed areas are representative of the
settlement system as a whole.


Based on these criteria, gaps in the existing data base can be discerned. In
order for the model to be successful, data on paleoenvironmental and geomorphic
conditions, chronological and functional dimensions of site classes, and social and
economic aspects of the subsistence and settlement systems must meet the
requirements of the modeling technique. The existing data base must also be
assessed to determine how far the data can be generalized. From this evaluation, the
researcher can determine what types of data, if any, must be collected in the field
before model building can begin.


Once gaps in the existing data are defined, a research program can be developed
to obtain the needed information. While it may seem obvious that research
programs should be developed to meet the needs of the particular situation, this has
often not been the case. In the usual course of events the first major research project
in a region is an overview, combining a review of the existing data and a literature
search and producing a planning document (e.g., BLM 1978). In essence, the
primary goal of this overview is to decide how future work should be conducted.


It would seem logical that the sample surveys that generally form the next step
in these major research projects should be based on the designs outlined in the
overview documents. In practice, sample surveys tend to follow rigid, almost
standardized formats in which 10 percent of a management-defined area (often an
aggregate or series of aggregates of coal lease tracts) is sampled in 40- or 160-acre
quadrats through the use of a simple or stratified random sample (see Berry 1984 for
a discussion of other problems with this approach).
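

For readers unfamiliar with the mechanics, the standardized design described above
can be sketched in a few lines of Python. This is an illustration of the procedure
only, not an endorsement of it; the function name, the stratum labels, and the
count of 200 quadrats are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(quadrats, fraction=0.10, seed=0):
    """Draw a stratified random sample of quadrats.

    quadrats: list of (quadrat_id, stratum) pairs.
    Returns a dict mapping each stratum to the sorted ids selected from it.
    At least one quadrat is drawn from every stratum.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    by_stratum = defaultdict(list)
    for quadrat_id, stratum in quadrats:
        by_stratum[stratum].append(quadrat_id)
    selected = {}
    for stratum, ids in by_stratum.items():
        n = max(1, round(fraction * len(ids)))
        selected[stratum] = sorted(rng.sample(ids, n))
    return selected


# Hypothetical study area: 200 quadrats assigned to three environmental strata.
quadrats = (
    [(i, "valley floor") for i in range(0, 120)]
    + [(i, "bench") for i in range(120, 170)]
    + [(i, "upland") for i in range(170, 200)]
)

sample = stratified_sample(quadrats, fraction=0.10)
for stratum, ids in sample.items():
    print(stratum, len(ids), ids)
```

A simple random sample is the special case in which every quadrat is assigned to the
same stratum. Nothing in the procedure itself indicates whether a 10 percent fraction
or a given quadrat size suits the research question.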


The uniformity of this design appears to be based on a desire to obtain
consistent and comparable results. While the objectives are commendable, the
approach is misguided. As will be discussed throughout this volume, the selection of
sampling technique, sampling fraction and sample size, and sample unit size and
shape are decisions that cannot be made in the abstract but are dependent on the
nature of the phenomena of interest and the research objectives. A 40-acre quadrat
may be an ideal sampling unit for estimating site density but a very poor choice for
studying intersite relationships. Moreover, consistent results have less to do with
the sampling design than with issues of survey intensity, site visibility, and sample
unit accessibility (see Chapter 6). Indeed, the best approach to achieving substantive
comparability between projects is not through design standardization but
instead through design flexibility.


The research design not only specifies how the area will be searched for sites
but also how sites will be defined and recorded. Definition of site classes will usually
require fairly intensive artifact analyses. “No collection” (or limited collection)
policies, while perhaps defensible from a preservation standpoint, run counter to
modeling requirements. The present situation in which temporal and/or functional
site classes are only poorly developed is unlikely to change unless intensive artifact
collections are made. Again, as with sampling design, decisions regarding data
recording are best made in relation to a specific project and not at an agency-wide
level.


Model Testing 


A central aspect of model development is model testing; in fact it can be argued
that a model does not really exist until it has been tested. Model testing requires
independent data. In general, archaeologists have relied either on collecting new
data for testing or on splitting their sample in two, using one half to develop the
model and the other half to test it. The former tendency has led to many predictive
models remaining untested or being tested only with the data used to derive them.
The latter approach often results in such small samples that models can be neither
reliably developed nor reliably tested. There are a number of statistical techniques
for validating models that circumvent many of the problems described above; these
techniques are discussed in Chapter 5.
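

The split-sample procedure mentioned above can be sketched as follows. This is a
deliberately crude illustration, assuming a model that simply flags environmental
zones whose site rate in the development half exceeds the overall rate; the function
names, the zone labels, and the 40 survey units are all hypothetical.

```python
import random


def split_sample(units, seed=0):
    """Randomly divide survey units into a development half and a test half."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def develop_model(development):
    """Return the zones whose site rate exceeds the overall development rate."""
    overall = sum(has_site for _, has_site in development) / len(development)
    counts = {}
    for zone, has_site in development:
        n, k = counts.get(zone, (0, 0))
        counts[zone] = (n + 1, k + int(has_site))
    return {zone for zone, (n, k) in counts.items() if k / n > overall}


def evaluate_model(high_zones, testing):
    """Fraction of test units for which the zone rule matches site presence."""
    correct = sum((zone in high_zones) == has_site for zone, has_site in testing)
    return correct / len(testing)


# Hypothetical survey units as (zone, has_site) pairs, for illustration only.
units = (
    [("terrace", True)] * 12 + [("terrace", False)] * 4
    + [("upland", True)] * 3 + [("upland", False)] * 21
)

development, testing = split_sample(units)
high_zones = develop_model(development)
accuracy = evaluate_model(high_zones, testing)
print("zones flagged:", sorted(high_zones))
print(f"accuracy on the withheld half: {accuracy:.2f}")
```

With only 20 units in each half, the result can change noticeably from one random
split to another, which is the small-sample problem noted above.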


In the validation stage it is necessary to examine not only the model itself but
also the data upon which it is based. Double-blind tests, common in forestry and
agriculture, are totally lacking in cultural resource management. Most agencies try
to ensure that land is surveyed for cultural resources only once. While the intent of
this policy is understandable (after all, if the entire land base can never be
completely surveyed, why waste money on resurveying parts of it?), it must be
remembered that the intended use of a predictive model from the agency’s perspective
is to allow for useful planning and management decisions about cultural
resources in a much larger area. Thus, the argument can be forwarded that, because
the model is only as good as the data upon which it is based, time and money spent
ensuring the quality of the data are prudent and wise investments.


Model Refinement


Unless 100 percent predictive accuracy is achieved, a model can theoretically
always be improved by changing the variables and/or respecifying the relationships
among them. It is extremely unlikely that we will ever achieve the high level of
predictive accuracy that would imply either complete understanding of past behavior
or past behavior that was so deterministically patterned that it can be accurately
predicted whether it is understood or not.


The real question for the land managing agency is “how accurate is accurate
enough?” The answer to this question depends on the agency and on the research
objectives as well as the anticipated results. For instance, a first attempt may
explain 60 percent of the variance in site location and indicate major trends in
settlement patterning. A researcher might consider this result a tremendous
success, while a land manager might view it as a dismal failure.
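

A figure such as “60 percent of the variance” is conventionally the coefficient of
determination of a fitted model. The short sketch below shows how that figure is
computed; the function name and the four observed and predicted site densities are
hypothetical, chosen so that the arithmetic is easy to follow.

```python
def variance_explained(observed, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    if ss_total == 0:
        raise ValueError("observed values do not vary")
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


# Hypothetical site densities per survey unit and hypothetical model predictions.
observed = [2, 4, 6, 8]
predicted = [4, 4, 6, 6]

r2 = variance_explained(observed, predicted)
print(f"variance explained: {r2:.0%}")
```

Here the total sum of squares is 20 and the residual sum of squares is 8, so the
model accounts for 60 percent of the variation and leaves 40 percent unaccounted for.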
Of note here are Dr. Prentice Thomas (New World Research), Dr. Christian Zier (Centennial
Archaeology), Dr. John Hanson (Kaibab National Forest), Gary Stumpf (Bureau of Land
Management), Dan Martin (Bureau of Land Management), and Marie Cottrell (U.S. Army Corps of
Engineers). Finally, we gratefully acknowledge Lynne Sebastian and June-el Piper for their masterful
editorial skills, and Jim Judge for ensuring that this document reached its final form without
substantive changes to the authors’ ideas or ideals.


Chapter 4


THE THEORETICAL BASIS OF ARCHAEOLOGICAL 
PREDICTIVE MODELING AND A CONSIDERATION OF 
APPROPRIATE DATA-COLLECTION METHODS 


James I. Ebert and Timothy A. Kohler 


This chapter, intended for both managers and archaeologists, discusses
archaeological predictive modeling from the theoretical and methodological standpoint.
During discussions between the authors and editors of the volume and
Bureau of Land Management and Forest Service archaeologist/managers that took
place before the book was written, it was suggested that the material contained
herein should be directed toward the cultural resource manager. The implication
was that managers would not be interested in the sorts of things that archaeologists
often produce. This was to be a practical volume, a guide to how predictive
modeling can be done and how it should be used, not a compilation of esoteric
anthropological theory. Some of those present seemed to be looking for a guide for
the manager/archaeologists on how to do “pragmatic” predictive modeling that
would cut research costs; others leaned more toward wanting a document that
would question the propriety of using predictive modeling for purposes of assessment
or mitigation.


Both groups seemed to feel that locational predictive modeling had already
been developed in useful form; the problems from their perspective lay in deciding
how or whether modeling should be used. It is our feeling that we do not know as
much as we should about how to do predictive modeling at present; that it is a
worthwhile goal to want to understand the process more thoroughly; and that
through the proper combination of rigor and research we can probably learn to do
such modeling in the near future. But at this stage in our understanding of the
modeling process, it would be premature to attempt to produce a guidebook.


In the two years since the original manuscripts for this volume were written, it
has become even more apparent that many archaeologists and cultural resource
managers want and need a guide to predictive modeling. With accelerating frequency,
especially during the past year, we have received calls and letters from
colleagues (some of whom are archaeologists and some of whom are not) in the
remote sensing and GIS fields who are contracting and experimenting with
archaeologists who want to implement predictive models in their study areas.
Those colleagues are invariably armed with third- or fourth-generation xerox copies
of early drafts of certain chapters from this volume, chapters that purport to tell just
how to do predictive modeling. After wading through the pros and cons of various
regression and sampling methods, they suddenly realize that the “modeling”
advocated in those chapters has a surprisingly and perhaps dangerously simplistic
foundation beneath all of the mathematical discussions.


“Surely there was more to prehistoric human behavior than this implies,” said
one colleague, himself a Native American trained as an archaeologist, remote
sensing specialist, and geographic information system researcher. “This is what we
do to map fox or squirrel habitats: look for water and shelter and food and then
draw polygons and isopleths around them. Squirrels don’t have canteens but
Indians did. Do these archaeologists think they know all about how complex past
peoples’ seasonal rounds were, why they went where they did?”


The authors of this chapter feel, in fact, that we as archaeologists do not know all
about the complex systemic behavior that must be the basis of archaeological
predictive modeling. The theme of this chapter, then, is that while there may be
more than one way to do predictive modeling once we know how to do it, as
suggested elsewhere in this volume, there is only one way to learn how to do it. For
those who contend that we already know how to do predictive modeling (“we do it
all the time”), this could be rephrased to read that there is only one way to prove that
we know how to do it. Developing predictive modeling as a tool to aid both
archaeologists and cultural resource managers must proceed from a consideration of
just what it is that both of these groups want and need to know about.


While some might feel, superficially, that archaeologists want to “explain”
while managers just want to know where and what the resource is, we will illustrate
that these goals are inseparable. Both must be approached from a theoretical 
standpoint — starting with the consideration of how we believe systems of human 
adaptation operated in the past and moving logically in the direction of evaluating 
how the ways we discover, collect, and analyze our data are compatible with 
learning what we need to know. 


Several reviewers of this chapter have protested that we are presenting “just 
one theory of predictive modeling” here. We would like to make it clear that the 
terms theory and theoretical are used here not in any partitive sense (“. . . he has one
theory and she has another . . .”) but rather to indicate where one begins trying to
build the framework of ideas and methods, and the hypothetical links between the
two, that will be a prerequisite to being able to do predictive modeling, no matter
what one means by that. This chapter, then, is about “The Theoretical Basis of
Archaeological Predictive Modeling,” as opposed to “The Non-Theoretical Basis of
Predictive Modeling.” What, one might ask, could be meant by non-theoretical
predictive modeling? Again, using theory to mean the framework by which ideas are
evaluated, a non-theoretical approach would be one that begins with an attempt at
the “unbiased” interpretation and derivation of knowledge from data, a direction
that we will characterize in this chapter as empirical predictive modeling.








Empirical predictive modeling, in its simplest form, consists of taking the
results of site surveys of an area and matching the locations of sites with
landform features or other indications of past characteristics of the environment.
Once these correspondences are noted, the proposition is set forth that more sites
will be found in areas where the greatest proportion of previously found sites was
located.


In more complex manifestations, empirical predictive modeling breaks previously
found sites into functional or other assumed types, derives complex taxonomies
of environmental indicators, sometimes specifies multiple working hypotheses
about the relationships between these two sets of variables, and applies
sophisticated mathematical models, correlation, and other associational analyses to
determine which sets of correlations are strongest. Then the same “prediction” is
made: that sites will be distributed in unexamined areas the same way (that is,
with respect to the same environmental indicators) that they were in the previously
explored area.


In a sense, empirical predictive modeling often works; that is, correspondences
between the presence of sites and of gross environmental indicators often exist
at some level of statistical confidence. Mathematical confidence tests have nothing
to do with explanatory confidence, however; they only test the probability of
obtaining specific results by chance, given certain characteristics of the samples
from which data are drawn. It will be suggested in this chapter that the “success” of
some empirical predictive models has as much to do with the ubiquity of the
archaeological record across the landscape, and with natural postdepositional processes,
as with the realities of the archaeological record.
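

What a statistical confidence test does and does not establish can be made concrete
with a small permutation test, sketched below. The test asks only how often an
association at least as strong as the observed one would arise if site presence were
shuffled at random among survey units. The function name and the 20 survey units are
hypothetical, and the permutation test is offered as one illustrative technique
rather than the method used in any particular study.

```python
import random


def permutation_p_value(in_zone, has_site, trials=2000, seed=0):
    """Estimate the chance of seeing at least the observed number of sites
    inside the zone if site presence were unrelated to the zone."""
    rng = random.Random(seed)
    observed = sum(z and s for z, s in zip(in_zone, has_site))
    shuffled = list(has_site)
    at_least = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real link between zone and site
        count = sum(z and s for z, s in zip(in_zone, shuffled))
        if count >= observed:
            at_least += 1
    return at_least / trials


# Hypothetical data: 20 survey units, 8 of them near water, 8 containing sites,
# with 6 of the sites falling in the units near water.
in_zone = [True] * 8 + [False] * 12
has_site = [True] * 6 + [False] * 2 + [True] * 2 + [False] * 10

p_value = permutation_p_value(in_zone, has_site)
print(f"estimated probability of this result by chance: {p_value:.3f}")
```

A small probability indicates that the correspondence is unlikely to be accidental.
It does not indicate whether the correspondence reflects past settlement decisions,
postdepositional processes, or the way the survey was conducted.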


This chapter will explore in depth the differences between theoretical and
empirical predictive modeling. We begin with general properties of human adaptational
systems as a first step in an exploration of the processes that anthropological
and ethnoarchaeological research suggests are responsible for the formation of the
archaeological record. The complexities of human adaptational systems and their
“translation” into the archaeological record may make difficult reading for
nonarchaeologists, but they are inescapable. In order to learn to apply empirical
predictive modeling to the archaeological record, one must “work back” through
these complexities, which may be even more difficult than our approach of “working
forward” through them.


It will also be suggested that one way to make this learning task (and future
empirical predictive modeling, once we learn how) easier and more economical is
to fit our data discovery and measurement methods to the things we want to know
about. In other words, we need to make our data-collection methods compatible
with our goal of explaining complex, multicomponent human systems. One major
difference between present-day attempts at empirical predictive modeling and a
theoretical approach is that empirical modeling has appropriate data biases
already built in. The data upon which it is based have been cast in terms of sites with
various assumed functions. It will be suggested in this chapter that new methods of
data collection, based instead upon our ideas about how the archaeological record is
formed and designed to allow the evaluation of inherent biases, may often be helpful
in the development of any workable predictive model, whether explanatory or
empirical.

As was seen in Chapter 2, researchers have been experimenting with empirical
predictive modeling for many years and are continuing to do so today. Most of the
locational predictions made in archaeology today are statements of empirical correlation.
True prediction of archaeological distributions of materials and of their
concomitant behavioral and natural causes is a worthy goal and one that is important
and necessary for both the cultural resource manager and the archaeologist.
Modeling and prediction are integral parts of the scientific explanatory process, as
will be illustrated in this chapter. They form a very real part of what archaeologists
must do to link their beliefs about the operation and organization of past systems
with the observable remains of the archaeological record, and they constitute the
only means by which those beliefs can be tested. Cultural resource managers need
to know where archaeological materials are located, where they can be found by
archaeologists, and what these materials are in order to preserve or otherwise
manage them.

The archaeologist and the manager are united in their attempt to arrive at
successful predictive models. There may occasionally be talk of theory vs. application,
of the research goals of the archaeological scientist being at odds with the
pragmatic objectives and responsibilities of the manager. But research cannot be
separated from such applications as attempting to predict the locations of archaeological
materials. Research provides information about the basic operation of past
human organizational systems; the discard of materials from these systems; the
incorporation of archaeological materials into what is discovered and seen as the
archaeological record; and the ways in which archaeologists discover, measure, and
interpret this record. Without this information there is no hope of understanding
the mechanisms that create cultural resources. Prediction is not a rote empirical
process: its scope encompasses the entire framework of archaeological inquiry and
explanation. Archaeologists and managers are partners in cultural resource management
and study.

We conclude our introduction with a discussion of what this chapter is and
what it is not. This chapter is different from the rest of the book: it presents ideas
about how the world works, about the structure of archaeology and anthropology,
about the organization of human systems, about the formation of the archaeological
record, and about how archaeologists perceive and use that record. Looking at the
task of locational prediction from this perspective tends to highlight the difficulty
and intricacy of the task, since it soon becomes apparent that a large number of
complex considerations can affect the locations and even the degree of predictability
of archaeological materials. These are things that must be explored before we
can hope to predict successfully and predict with understanding the locations of
cultural resources. Although we present methodological suggestions for overcoming
some of these difficulties, we risk being regarded as spoilers to the extent that
we cannot at this time offer easy fixes for all of the problems we can foresee in
locational prediction.
This chapter is not an overview of how people are currently proposing, or
attempting, to do predictive modeling; these topics are discussed in other chapters.
Instead of focusing on the modeling process, this chapter discusses some of the
things that we need to think about (and some of the ways in which we might think
about them) in order to perfect the process of predicting whatever we decide to
predict. To begin with, we attempt to define the places that modeling and
prediction occupy within the explanatory framework of archaeology; that is, what
are modeling and prediction? What do we want (or what do we need) to model and
predict? The question of research goals is also addressed: what will we have to
learn in order to be able to do these things?


Methodological questions are very important in this discussion. The
interpretations that we make concerning the archaeological record are probably
influenced as much by how archaeologists deal with their data as by what people
actually did in the past. How can we collect the appropriate data? How can we ensure
consistency and comparability in data collection, measurement, and analysis within
and between surveys and other studies? Should or can every researcher have a unique
research problem or orientation, or are there general problems upon which we must
concentrate, problems of critical importance to the manager and the archaeologist?
And finally (and perhaps most important from a management perspective), how can
we efficiently collect data and do the other research that is necessary if we are to
learn how to predict characteristics of the archaeological record and how to give
these characteristics meaning in terms of past behavior?


These and many other topics are explored in this chapter. We begin by
discussing the framework of archaeological explanation within which modeling and
prediction must take place.


PREDICTION, MODELS, AND THE SCIENTIFIC 
FRAMEWORK OF ARCHAEOLOGY 


The archaeological record is a complex amalgam of patterning in material
objects created by the organization of peoples’ activities in the past and by the
intervening cultural and natural processes that have preserved or rearranged these
materials since they were lost or abandoned by their past owners. The
archaeological record consists solely of patterns that we can see today; that is, it
is a contemporary phenomenon. It is important to note that these patterns do not
ordinarily record a single moment frozen in time that, given the proper expertise,
we should be able to reconstruct. In fact, the archaeological record is not
ordinarily the simple result of past episodes of individual behavior, and it is only
through a scientific, explanatory archaeological framework that we can give it
meaning. Nor is the archaeological record a mirror that reflects past behavior in a
dark, warped, and incomplete fashion. This is only the case if what we want to do is
to reconstruct in microscopic (and normally impossible) detail an instant view of
the past. We would argue that this is not the goal of archaeology. The nature and
scale of the archaeological record is such that we will be more successful in
understanding it if we consider it not as the reflection of actions of individuals
but rather as the cumulative record of an entire system. These systems are not
directly embodied in nor are they equivalent to the materials we find in and on the
ground. Linking past organizational systems with the archaeological record can only
be accomplished through the explanatory framework of archaeology. The only
distortions in this reasoning process will exist in archaeologists’ models, not in
the archaeological record.


Explanation in Archaeology


Explaining things in archaeology is a two-way street, a progression of theory
and method. Theory is the way in which we think about things, particularly about
the existence, nature, and direction of cause-and-effect relationships, and method
is the way in which we go about dealing with data. These two parts of the
explanatory process are inseparable, regardless of what the archaeologist wants to
explain. In the chart shown in Figure 4.1, some of the links between theory and
method in archaeological explanation are shown. This diagram is intended more as a
guide to how we might think about the explanatory process than as an indisputable
flow chart of archaeological thought, and many other categories in the progression
might be acknowledged. The point is that explanation involves both theory and
method. In the diagram, one might proceed in either direction: from ideas about
human subsistence, settlement, mobility, and technological organization (that is,
the organization of systems) to interpretation of patterning in the archaeological
record, or vice versa. Linking the two extremes of this diagram constitutes
explanation and requires the modeling of a series of intervening processes. These
processes transform the ways that people organized their systems into what we see
today as the archaeological record. One class of these processes links static
archaeological data with the dynamics of past systems; the study of these has been
referred to as formulation of “middle-range theory” (Binford, ed. 1977:0-9). In our
diagram, this class comprises discard behavior and depositional and
postdepositional processes; in its broadest sense, middle-range theory provides
guidelines for generating empirically falsifiable outcomes from general theory.
Other factors that further remove the patterning we see in the archaeological
record from past systemic organization are those introduced by archaeological
methodology itself: the ways in which archaeologists recover, measure, analyze, and
interpret the archaeological record.


These things that separate high-range theory from the meaning that we assign
to patterned data represent complicating factors in attempts to interpret the
archaeological record. Moving from one of these complicating factors to another
requires qualitative rather than simply quantitative “translation”; that is, the
physical archaeological record left behind after the action of each of these
factors is of a very different nature than it was before. In the course of this
chapter, each of the components of the explanatory archaeological framework will be
discussed. First, however, the place of modeling and prediction (the subject of
this volume) in the explanatory process must be addressed.


Figure 4.1. The explanatory framework of archaeological science. Explanation is the process of modeling human
subsistence, settlement, and mobility organization using archaeological and anthropological data, as well as anthropological,
environmental, and systems theory, and confirming these models using prediction to derive expectations for data
patterning. These predictions must also be linked with higher-level theory through middle-range theoretical propositions
concerning the things that separate the static archaeological record from the organization of human systems. Empirical,
inductive projection, sometimes referred to as “prediction” in the literature, is a methodological exercise in which the
results of future archaeological discovery are projected from noting correspondences between where sites have been found
previously and environmental or landform features of assumed significance.


Modeling and Prediction


In Figure 4.1, the lowest box, interpretation of data patterning, is connected 
with the highest theoretical category, subsistence and settlement/mobility
organization, by the two-way process of explanation. Explanation involves integrating
archaeological data with other sorts of information —ethnographic, ethnoarchaeo- 
logical, historical, environmental—to create models that connect the archaeological 
record with what we think was happening in the past. These models are abstract 
and complex formulations and can never be proved to be strictly “true.” In fact,
this is not their purpose: they are constructs that help us to assign meaning, rather 
than laws or translational rules. Yet they need to be tested or confirmed if we are to 
know whether they are realistic and useful, and whether they elucidate the 
mechanisms behind how people live in their world. 


The way that models are tested is through prediction. Prediction is the
formulation of hypotheses (that is, testable statements of expectations) based
upon models. If predictions based on models are found to be successful, then the
model and the theories upon which it is based tend to be confirmed. In the structure
of scientific explanation, models and theories can never be proved to be true, but if
the mechanisms behind the predicted phenomena are being modeled faithfully, the
predictions based on them will be consistently successful.


Successful prediction of phenomena in the real world is an accomplished fact in 
many scientific disciplines, such as electronics, chemistry, and physics. These 
successes consist of experiments in which predictions based on models are
confirmed in a wide variety of situations, with external influences being held
“equal.” Such successes are unknown at present in archaeology. Not only are we
unable to predict phenomena over a wide range of situations, but there is virtually
no agreement as to what we want to predict and what we have to model in order to do
that.


What Do We Want to Predict and What Do We Need to Model? 


The literature dealing with predictive modeling is usually directed toward
determining the locations of archaeological materials, whether for discovery
purposes (Artz and Reid 1983; Davis 1980a, 1980b; House and Ballenger 1976; Lynch
1980; McManamon 1981a, 1981b; Nance 1980, 1981; Spurling 1980; Warren 1979), for
purposes of finding archaeological “voids” (Baker and Sessions 1979; Kemrer 1982;
Kemrer, ed. 1982; Klesert 1983; Kvamme 1980, 1982, 1983a; Parker 1985; Peebles 1983;
Sabo and Waddell 1983; Scholtz 1980, 1981), or for more avowedly explanatory
purposes (Chandler and Nickens 1983; Limp 1983; Nance et al. 1983; Waddell ! 43).
Prediction of the locations of archaeological materials is a primary concern of
cultural resource managers, as well, for in order to manage resources one must know
where they are. It could be argued, and will be argued later in this chapter, that
prediction of the locations of sites is an ambiguous goal, for the concept of the site is
of uneven usefulness when the ways in which archaeological materials are
deposited, accumulated, and discovered are taken into account.


There may well be things other than simple locations, too, that archaeologists 
and managers might want to predict. Densities of materials, for example, might be 
of interest (Foley 1981c; Thomas 1973). The diversity or clustering of assemblage 
components at different sample unit sizes (Whallon 1973, 1974, 1984) or the occur- 
rence of patterning congruent with intrasite activity structure (Kintigh and 
Ammerman 1982) are other possibilities. The most obvious thing, or at least the first 
thing, that cultural resource managers need to predict, however, is the location of 
cultural resources. 


To make predictions we need to have models, and those models must span the 
entire explanatory framework rather than simply concentrating on those things we 
want to predict. Models exist at a theoretical level, not an empirical one. Their
purpose is to elucidate the mechanisms behind the formation processes of the
archaeological record, i.e., to explain it. Prediction, then, is a subset of explanation.
Whether predictions are to be locational or not, it is human organizational systems
that must be modeled, as well as all those complicating factors between this highest 
level of human behavior and the archaeological record as we see and measure it. 


Cultural resource managers and archaeologists share the need for explanatory 
models. We do not yet have many satisfactory archaeological models or even 
components of such models. It will undoubtedly take many more years to decide 
what sorts of models are needed by both archaeologists and managers. Some of the
things that we may need to consider in this decision process (those “complicating
factors” referred to above) are discussed in the remainder of this chapter.


THE NATURE AND ORGANIZATION OF HUMAN SYSTEMS: 
SETTLEMENT, MOBILITY, AND TECHNOLOGY 


A Systems Perspective on Prediction 


As discussed in Chapter 2, anthropologists interested in the relationships 
between people and their environment have increasingly adopted an ecosystemic 
perspective on these relationships. Over the past two decades archaeologists have 
also acquired the habit of referring to the dynamic interaction between people and 
the ecosystem as the settlement system without worrying too much about what it 
means, in general, to call something a system. (A notable exception is D. L. Clarke 
[1968].) Yet our acceptance of this term has significant implications for our attempts
to predict the locations of cultural resources. A system may be practically defined as 


a circumscribed complex of relatively bounded phenomena, which, within these bounds,
retains a relatively stationary pattern of structure in space or of sequential configuration
in time in spite of a high degree of variability in the details of distribution and
interrelations among its constituent units of lower order [Weiss 1973:40].


This vague characterization can be sharpened by an exclusion. The mere fact
that something is composed of several components does not necessarily make it a
system; a distinction can be made between systems and mechanisms. In a mechanism
such as a typewriter, for example, one action rigidly triggers other actions in a
completely determined manner, corresponding to notions of strict linear cause and
effect. In systems, however, there is much freer interplay between the components,
despite considerable predictability in the actions of the system as a whole. Systems
are not, however, composed of parts that are chaotic in their behavior. Living
systems have an evolutionary tendency toward consolidation along stereotyped
tracks and toward determinacy in the behavior of the parts; such systems
ultimately realize some balance between flexibility (indeterminacy) and rigor
(determinacy) (Weiss 1973:54-59). Relatively rigid designs have great efficiency but
are only successful if the problems to be solved are always the same.


Another characteristic of living systems (for example, a human community of 
hunter-gatherers in its regional ecosystem) is that they tend to provide stability of 
existence for their components (individual bands or households, for example), 
although the state of any of these components at any time is itself unpredictable, 
varying far more than the state of the system of which the components are a part 
(Piaget 1978:59-72). This characteristic of systems leads in turn to a hierarchy of 
predictability that Weiss calls stratified determinism: there is predictability in the 
behavior of the system despite demonstrable indeterminism in the individual
constituents of that system.
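
The contrast between unpredictable components and a comparatively predictable
whole can be illustrated numerically. The following is only a minimal sketch under
strong assumptions that are not part of the argument above: the components are
treated as statistically independent, each fluctuates around the same mean, and
the state of the "system" is simply their average. The function name, the number
of components, and the numerical values are all hypothetical. Living systems are
coordinated rather than independent, so the sketch shows only the weaker
statistical point that an aggregate can vary far less than its parts.

```python
import random
import statistics


def simulate(n_components=25, n_steps=500, seed=0):
    """Return (typical component CV, system CV) for a toy system.

    Assumption: each component (say, one household's daily food return)
    fluctuates independently around the same mean. The system state at
    each step is the mean of all component states at that step.
    """
    rng = random.Random(seed)
    # series[i][t] is the state of component i at time step t.
    series = [[rng.gauss(100.0, 30.0) for _ in range(n_steps)]
              for _ in range(n_components)]
    # zip(*series) regroups the values by time step.
    system = [statistics.fmean(step) for step in zip(*series)]

    def cv(values):
        # Coefficient of variation: spread relative to the mean.
        return statistics.stdev(values) / statistics.fmean(values)

    component_cv = statistics.fmean(cv(s) for s in series)
    return component_cv, cv(system)


component_cv, system_cv = simulate()
print(f"typical component CV: {component_cv:.3f}")
print(f"system-level CV:      {system_cv:.3f}")
```

Under these assumptions the system-level variation comes out several times
smaller than that of a typical component, which is the sense in which
predictions are better aimed at the system than at its individual constituents.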


We suggest that human settlement systems share many characteristics with 
general living systems. Settlement systems are the way that people move around on 
and locate themselves within a landscape. The individual constituents of this
system (the locations of individuals or groups at any given moment, the ways that
decisions are made or rationalized, the likes or preferences of human participants,
and all the minute details that seem to constitute the everyday world when one is
actually involved in a system) are inherently less predictable than are the structure
and patterning of the system as a whole.


This is not to say that any part of the operation of general systems or of human
settlement/mobility systems is random or, in the final analysis, indeterminate. The
point is that scientific research addressing a research problem dealing with a system 
component must be targeted at the system to which the component belongs. Our 
job in spatial prediction, then, is to understand the structure of the system first. 
Accordingly, we will spend some time in this chapter discussing what the structure 
of a settlement system might include. 


In the course of this chapter we will argue that modeling undertaken for 
purposes of predicting the locations and characteristics of phenomena in the 
archaeological record should take place on the level of human organizational 
systems. In order to demonstrate this, we propose to take the reader on a journey 
through the many stages of archaeological explanation, beginning with some
approaches to modeling the nature of human settlement/mobility systems. 














THEORETICAL BASIS AND DATA-COLLECTION METHODS 


It will be important to remember that although individuals obviously make
artifacts and other parts of the archaeological record, neither the patterning nor the 
role of these portions of the archaeological record in space and time can be equated 
with the actions (and even less the thoughts or decisions) of individuals or with 
specific episodes of behavior. Neither are the cultural materials we find today 
located where they are because of simple interactions between human behavior and 
specific resources or landscape variables. 


The patterning of materials in the archaeological record is a result of the 
organization of the cultural system that produced those materials. A cultural system 
is not the summation of the actions of individuals but rather consists of the
components in an organizational framework under which actions are structured; the 
patterning of cultural materials will embody aspects of this framework rather than 
provide any sort of instant view of a frozen ethnographic moment (Binford 1981). 


In cultural systems, people, things, and places are components in a field that consists of
environmental and sociocultural subsystems, and the locus of cultural process is in the
dynamic articulations of these subsystems [Binford 1965:205].


The actors in a cultural system are not only people, but places, artifacts, strategies, 
schedules, landscapes, climate, environment, resources—and many other things as 
well. 


One hallmark of contemporary attempts at archaeological prediction, and 
indeed of much modern archaeology in general, is the explicit or implicit assump- 
tion that environmental factors are major, even exclusive, determinants of much 
human behavior (site location, subsistence strategies, etc.). Environmental varia- 
bles, such as distance to water, distance to resources assumed to have been 
important, shelter, and available lookouts, are compared with the location of 
archaeological materials to determine whether there are correlations between these 
landscape characteristics and such cultural variables as the location of sites. The 
causal link between site locations and natural, independent variables is usually
considered to be multivariate; that is, people positioned their sites with respect to
an optimal combination of all the resources in which they were interested.
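
To make concrete the kind of comparison on which such correlative studies rest,
the sketch below tabulates how often surveyed units contain a site as a function
of distance to water. It is a deliberately simplified, single-variable
illustration and not a reconstruction of any published method: the function
name, the 500 m band width, and the survey data are all hypothetical and were
invented for this example. It reports only an association between site presence
and one landscape variable and says nothing about why that association exists.

```python
from collections import defaultdict

BAND_WIDTH_M = 500  # hypothetical width of each distance band


def site_rate_by_band(units, band_width_m=BAND_WIDTH_M):
    """Fraction of surveyed units containing a site, per distance band.

    `units` is an iterable of (distance_to_water_m, has_site) pairs.
    Returns a dict mapping the lower edge of each band to that fraction.
    """
    surveyed = defaultdict(int)
    with_site = defaultdict(int)
    for distance_m, has_site in units:
        band = int(distance_m // band_width_m) * band_width_m
        surveyed[band] += 1
        if has_site:
            with_site[band] += 1
    return {band: with_site[band] / surveyed[band]
            for band in sorted(surveyed)}


# Hypothetical survey units, invented for illustration only.
units = [(120, True), (300, True), (450, False), (700, True),
         (820, False), (990, False), (1200, False), (1400, False)]

for band, rate in site_rate_by_band(units).items():
    upper = band + BAND_WIDTH_M
    print(f"{band}-{upper} m from water: {rate:.2f} of units contain sites")
```

In these invented data the proportion of units with sites falls as distance to
water increases. A correlative model would project that pattern onto unsurveyed
land; whether the pattern reflects the organization of a past system is a
separate question that the tabulation cannot answer.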


Probably the best example of this approach is in Jochim (1976), often cited as
one of the seminal works in archaeological prediction. Jochim argues that, since the
distributions of individual resources seldom coincide, these resources exercise
differential degrees of “pull” on settlements in relation to their value to the people
who occupied those settlements. One problem with this approach is that it is based
on a model of the individual person as decision-maker and of specific resources as
the basis for making decisions about where to locate activities. That is, it attempts
to predict specific components of the larger organizational system without regard to
the system of which they are a part. This is not the level of human organization that
must be addressed; it is the structure of human organizational systems within
ecosystems that needs to be modeled in order to predict things about the
components of human systems. How ecosystem variables relate to this task will be
considered later in this chapter; now, however, we will attempt to illustrate an
approach to modeling human systems.


Systemic Mobility/Settlement Organization


Archaeologists and cultural resource managers work with an archaeological
record produced by prehistoric human systems. All the “facts” that we know about
these past systems are actually meanings that we have assigned to the
archaeological record. Empirical correlative models that use distances between
sites and resources as bases for predictions assume that simple proximity of one
thing to another implies some sort of connection or causality, and that distance
negates these relationships. The assumption that proximity means something is of
course supportable when one is observing ethnographic instants in time. It would be
supportable in the interpretation of the archaeological record if we could be assured
that we are observing therein instants in past time, i.e., a spatially and temporally
nonoverlapping archaeological record. Not only have we no such assurances, but in
fact it is almost certain that we are not. Many locations are used for short time
periods within most human systems; resources may be transported great distances
in anticipation of future needs; and many resources are not in constant demand.
Prehistoric people, for instance, could certainly travel some distance without taking
a drink, and they certainly had the mental resources to carry water with them. We
should have as much capacity to realize (on another level) that the location of one
component of a system (where an artifact is discarded, or where a camp is made) is
affected by the patterning of other components in space and time: for instance,
where another camp was made and what was there last week, or what a group
anticipates it will find at the next camp. Rather than being due to the immediate
proximity of the resources, in fact, archaeological site patterning is the result of
long-term repetition (or lack thereof) in the “positioning of adaptive systems in
geographic space” (Binford 1982:6), and the use of space is not uniform, even within
the same system. Some activities occur at concentrated locations and some do not.
The spatially concentrated nature of some activities and the dispersed nature of
others have been discussed in terms of “ranges” of various types (Foley 1977, 1978,
1981b; Jochim 1976), settlements vs. activity “nodes” (Isaac 1981:134), and
catchments (Vita-Finzi and Higgs 1970).


The very nature of human systems (organized through such tactics as planning
and anticipation and effected through caching, transport of materials, staged
manufacture, and intensive reuse and recycling of material items) brings the use of
proximity arguments in predictive modeling under question. Human behavior is
different from animal behavior in that animals in general do not flexibly or
consciously anticipate, plan, or transport, cache, and recycle materials; animals do
not have behavioral systems organized in a human way.


The things that people do that involve planning, anticipation, and the complex
geographic repositioning of materials (some or most of which are not left where
they were used, or are reused there and other places in other times) will not be
understandable in any simple way through correlations of artifacts or other cultural
evidence and supposed nearby resources. Of course, unplanned events and
activities will be represented in the archaeological record, for even in the most
highly planned systems (and perhaps particularly in them) unanticipated
contingencies will arise. These events, in fact, may be explainable through
artifact-resource proximity arguments; these are the things that people do like
animals, and the same sorts of predictive modeling that our previously mentioned
colleague uses to model fox and squirrel habitats can be used to “predict” them.


So there are aspects of both human organization and “animal” behavior
embodied in the archaeological record; perhaps we do know how to do some
archaeological predictive modeling after all! But before you skip the rest of this
chapter and turn to discussions about the best regression models, we think that just
a few very important questions must be asked, including: What proportion of human
behavior is immediate and unplanned (and thus explainable using proximity arguments)
and what proportion is systematically organized? Which portions of human behavior
are we most interested in?


The nature of activities that happen at any place during an occupation will of 
course have a relationship to the resources available there, but this relationship may 
not be a simple one, and its strength will be affected by such environmental 
characteristics as the distribution or diversity of resources (Harpending and Davis 
1977) or the annual range of temperatures requiring, enabling, or restricting storage 
of foodstuffs (Binford 1980). But economic resources are not the only actors in 
human organizational systems, and they will not be the only determinants of where 
different activities are carried out in these systems. What a group does at one place, 
for instance, may be as much affected by what they will do at the next place they 
visit, or what they did at the last place they visited, as it is by the available resources
at the current location. 


An examination of one taxonomy of differential mobility patterns will help to
illustrate the interlocking nature of the parts of a human organizational system, as
well as the implications of different forms of organization for the formation and
ultimately the predictability of the archaeological record. Binford (1982) has
distinguished a number of ranges or mobility zones that can be used in different
combinations to characterize the ways that people use the space around their
residential base. A residential base is the place where a group lives, where resources
are consumed, where children are reared, and where most maintenance activities
take place. Residential camp sizes vary, mostly in relation to population sizes.
There are certain complications in this relationship, however, that prevent direct
projections of population on the basis of site size, as will be discussed later in this
chapter. Surrounding the residential base is the foraging radius, which is usually
considered to be within 10 km of the camp in any direction; resources in this zone
are exploited in the course of trips that last a day or less and from which both
resources and people return to the residential camp. This area contains locations,
places where resources are extracted and where limited processing is carried out.
Few maintenance activities are carried out at locations. Outside the foraging radius
is a logistical radius, which is exploited by special-purpose task groups who stay away
from the residential base for at least one night and sometimes for months. Within
the logistical radius, both maintenance activities and special-purpose activities can
and do take place.


Not all groups use these different radii to the same extent. The use of these
radii varies with the frequency with which a group's residential base is moved, and
this, in turn, is conditioned by environmental and perhaps in some cases social
factors. In highly diverse environments almost all resources can be found within a
group's foraging radius, and people in equatorial jungles and possibly in some other
environments, such as the Kalahari Desert and the southern parts of the North
American Great Basin, particularly during the summer months, acquire nearly all
resources using a generalist encounter strategy during daily walkabouts. Intensive
use of the foraging radius, however, leads to quick depletion of resources, and when
this happens the residential camp is moved, most often to one edge of the old
foraging radius. From this new basecamp a new foraging radius is established
(Figure 4.2). Only half of this new radius is actually usable for foraging, of course,
since the portion shared with the old radius is still depleted. This sort of mobility
strategy results in what Binford (1982:10) calls a half-radius continuous pattern.


In more differentiated or simpler environments, a complete radius leapfrog pattern
of residential mobility is more often found ethnographically. This settlement
system consists of residential moves that result in little or no overlap between
successive foraging radii but produce logistic radii that do overlap (Figure 4.3). In
this situation, logistical camps are often located at old residential bases because
materials in these abandoned camps can be reused and because the specialized
task-group members are familiar with the old residences and their surroundings,
reasons for site location that are at least partly nonenvironmental. Examples of
cultures with this type of settlement system include the northern Paiute and the
Shoshone. A variation of the complete radius leapfrog pattern that is common in
lower-biomass settings is the point-to-point pattern found in high-elevation settings
and claimed to be used, for example, by the Yaghan of Tierra del Fuego (Wills 1980).
In this case residential moves involve no overlap in use zones at all, not even in the
logistic radii. The location of residential camps under this mobility pattern
represents a compromise among the locations of known but spatially incongruent
resource distributions. These resources are then exploited through logistic mobility.
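
The spatial logic of these patterns can be explored with simple geometry. The
sketch below is an idealization and not part of Binford's model: it assumes
that each radius is a perfect circle and that the old and new residential bases
are a known distance apart, and it returns the fraction of the new circle that
overlaps the old one. The function name and the 30 km logistic radius are
hypothetical; the 10 km foraging radius is the figure quoted earlier in this
section.

```python
import math


def shared_fraction(radius_km, move_km):
    """Fraction of a circular zone that overlaps the previous zone.

    Assumption: both zones are circles of equal radius whose centers
    (the old and new residential bases) lie `move_km` apart.
    """
    r, d = radius_km, move_km
    if d >= 2 * r:
        return 0.0  # the circles do not overlap at all
    if d == 0:
        return 1.0  # no move: the zones coincide
    # Area of the lens formed by two equal intersecting circles.
    lens = (2 * r * r * math.acos(d / (2 * r))
            - (d / 2) * math.sqrt(4 * r * r - d * d))
    return lens / (math.pi * r * r)


# Half-radius continuous pattern: camp moves to the edge of the old radius.
print(f"{shared_fraction(10, 10):.3f}")

# Complete radius leapfrog: foraging radii clear each other entirely ...
print(f"{shared_fraction(10, 20):.3f}")
# ... while a larger (hypothetical 30 km) logistic radius still overlaps.
print(f"{shared_fraction(30, 20):.3f}")
```

Under these idealized assumptions a move of exactly one radius leaves about 39
percent of the new foraging circle shared with the old one, so the "half" in the
half-radius pattern is best read as a schematic approximation. A move of two
radii removes the overlap between foraging radii entirely while leaving a
substantial overlap between the larger logistic radii, which is the
configuration in which reuse of old residential bases is expected.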





[Figure 4.3 legend: foraging radius; logistic radius; residential bases; locations; reoccupied locations.]


Figure 4.3. The complete radius leapfrog pattern of landscape use. This model was devised to typify the
land-use strategy of logistically oriented groups. Locations that are reused within the zones of logistic radius
overlap could contain assemblages representing different functional uses. Archaeological materials found within
the foraging radii would be dispersed and continuous; materials at locations within the logistic radius are more
focused but may represent multiple functional occupations (after Binford 1982).


Some Examples of Variability in Reuse of Places


Binford’s mobility/settlement type taxonomy, as described above, is not an
attempt to arrive at any “whole truth” about human spatial organization; rather it
is an attempt to model different types of organization so that their consequences in
the archaeological record can be predicted. Binford’s model is not altogether
theoretical in its derivation; rather, it is based on ethnographic examples gathered
by anthropologists and ethnoarchaeologists studying hunter-gatherers,
pastoralists, and agricultural groups throughout the world. Ethnographic and
ethnoarchaeological accounts of variations in mobility and settlement patterning
indicate that groups operating under different mobility/settlement patterns exhibit
different patterns of reuse of places. This observation has important implications
for our understanding of the complexity of the archaeological record.


A number of expectations or predictions about the reuse of places can be drawn 
from Binford’s mobility model. Binford’s suggestion that under the complete-radius
leapfrog pattern old residential bases will be reused for special-purpose
logistic functions leads to the expectation that, under such a mobility organization, 
sites will occur at definite points within the landscape where different functions 
would overlap. In addition, since the location of residential bases represents a 
compromise among the locations of resources exploited through logistic mobility, 
we might also anticipate reuse of residential locations as residences, assuming stable 
distributions of logistically exploited resources. 


For the half-radius foraging pattern, on the other hand, there are no logistic 
camps and resources are more evenly distributed. Reuse of residential camps might 
be less common under this form of organization, in part because foraging radii would
more likely be depleted of critical resources for some time and in part because of the 
nature of the environments in which foraging is most commonly practiced, as will be 
discussed below. Foraging radius locations—places where resources are encoun- 
tered and perhaps minimally processed —could be expected to occur almost ran- 
domly within the foraging radius, a pattern that through time would lead to a 
low-visibility but continuous archaeological record. 


Anthropologists and archaeologists have found that living hunter-gatherer 
and pastoralist groups that pursue a relatively generalist strategy and fall toward 
the foraging end of the mobility/settlement scale utilize their foraging radii more or
less continuously. Population densities among such groups are characteristically 
low. An annual average density of 0.03 persons per square kilometer has been 
recorded among the Kade area Bushmen (Harako 1978; Tanaka 1969), and even 
among the relatively densely populated Ituri Forest Pygmy a density of only 0.2 to 
0.6 persons per square kilometer is typical. Characteristically, such peoples exploit 
their sparsely populated ranges relatively evenly. 


Foley (1981c:21) cites very low densities of artifacts among such groups in 
Africa, even on residential bases if those bases were only occupied once. What is 
more, a large percentage of artifacts among such groups are discarded at what Foley
calls “secondary home range foci,” which are the equivalent of Binford’s “locations”
within the foraging radius. These locations are usually used only once, and
their occurrence throughout the environmentally diverse home range assures even
distribution of discarded items. Gould (1980) reports that among Australian aborigines
only about 1 percent of lithic discard occurs at the residential basecamp; most of
the rest occurs within the home range (foraging radius). The results of evenly
distributed, low-density discard over the length of time monitored by ethnologists
are almost invisible, but over archaeological time this discard process can produce
impressive and relatively continuous densities of discarded materials.


This discussion has important implications for the ways in which the archaeological
record of foragers should be surveyed, measured, bounded, and analyzed, a
topic to be discussed at greater length in later sections of this chapter. Given a
foraging adaptation, it is clear that, in much of the contemporary archaeological
record, discrete “sites” will not be apparent. Nonetheless, the continuous archaeological
record left by groups employing a foraging strategy includes within it
materials related to both types of activity areas used by these groups (residential
and nonresidential blocs). 


Although few human groups pursue a pure foraging subsistence strategy, most
groups represented in the archaeological record may well have pursued a foraging
strategy at least part of the time. A model such as Binford’s, which contrasts two
extreme subsistence and mobility/settlement strategies (foraging and
collecting), is not meant to reflect the real world as much as to provide a basis for
predictions. All actual human strategies should fall somewhere between these two
extremes. Among groups that depend more heavily on logistically organized collecting
strategies, there are definite nodes or foci in the landscape that are repetitively
used for the same or different purposes. Even among near-classic foragers, such as
the Bushmen described by Yellen (1976), some camps can be seen to be resettled
even within the short span of ethnographic time.


Most North American prehistoric and ethnohistorically recorded hunters and
gatherers could be expected to employ subsistence strategies more closely resembling
the collecting portion of Binford’s model and thus to exhibit a logistic
mobility/settlement pattern. For example, most Shoshone groups of the Great
Basin, who exploited only wild foods even ethnohistorically, occupied a number of
functionally differentiated types of camps. Four major food sources were exploited:
Indian ricegrass seeds, pinon nuts, jackrabbits, and antelope (Powell 1980). Winter
villages served as residential bases, and foraging for seeds and rabbits took place
near these camps; in addition at least two types of special-purpose camps were
occupied. Pinon camps, which were reused when the nuts were locally available,
were occupied by one or more families for periods ranging from 2 weeks to several
months. Antelope camps were also reused, although only about once every 12 years
owing to pressure on antelope populations. When these antelope camps were in use,
however, they were occupied by a large population consisting of many residential
groups, and they were spatially quite extensive. The Shoshone antelope drive camp
is a good example of a location being chosen not on the basis of “multivariate”
determinants but instead because of the presence of a single resource. As Thomas
(1983:79) notes, at antelope camps “the short-term [gains] of high-bulk animal
procurement temporarily offset the high costs of transporting essentials such as
firewood and water.”


The pastoralist Navajo also exhibit differentiated use of locations within a
home range centering on a permanent camp. Some of these functionally specific
locations are used only once or infrequently (temporary windbreaks, tent locations),
but many more are revisited regularly (e.g., stock shelters, storage features,
dumps, antelope hunting corrals, sweathouses; Kelley et al. 1982). Although commonly
characterized as pastoralists, the Navajo also grow crops, and they maintain
agricultural fieldhouses when the distance from the permanent camp to the field is
greater than ca. 3.2 km (Russell 1978). Some of these fieldhouses are occupied for the
entire agricultural season, and commonly they are reoccupied from year to year.


As people become more intensively agricultural and residentially sedentary,
their logistic use of nonresidential locations may actually be greater than that of
hunter-gatherers. But because there is little residential relocation, these special-purpose
locations are used for more or less the same set of functions, although not
necessarily all at the same time. Among Pueblo agriculturalists, both living and
prehistoric, special-purpose sites have often been lumped under the rubric of
“fieldhouses,” although they may have had many functions, including agricultural
camps, lookouts, hunters’ camps, and storage facilities (McAllister and Plog 1978;
Moore 1978). Mesoamerican analogies suggest that small fieldhouse locations originally
occupied for purposes of tending agricultural fields may grow into larger
residential villages through time (Fish and Fish 1978). Ellis (1978) observes that
among the New Mexico Pueblos most fieldhouses belong to single individuals and
thus represent recurring occupations for only a generation. She also notes that these
structures are used not only while fields are being tended but also for “vacations.”


Implications of Variations in Settlement Mobility Patterns 
for the Archaeological Record 


Binford’s model of hunter-gatherer subsistence strategies and their concomitant
settlement/mobility organizations has suggested two polar extremes, that of
subsistence generalists with a foraging pattern of spatial use, and that of specialist
collectors whose use of space is logistically organized. Ethnographic and ethnoarchaeological
documentation provides support for the conceptual validity of both
of these patterns and also suggests that most groups occupy a position somewhere
between these extremes. Prehistoric systems also can be expected to fall somewhere
on this continuum; in other words, some aspects of their use of space will be
continuous and other aspects will result in the reuse of places for the same or
different functions.


What are some of the implications of these patterns for the formation of the 
archaeological record, particularly with respect to predictive modeling? A first 
obvious implication is that within any single organizational system there should be a
number of different site types with functionally and formally different contents.
The determinants of the placement of these different site types vary, with some site
locations (those of residential sites in logistic systems, for instance) being compromises
among the locations of known resources (i.e., determined and multivariate).
Some site types, for instance residential bases and locations in foraging systems, will
be far less predictably located on the basis of correlations with resource locations.
Still another class of sites, special-use camps within logistic systems, may be
locationally quite dependent upon the occurrence of a single resource and independent
of the occurrence of other resources. In order to predict the locations of
special-use sites, one would need to know just what resources were being exploited
at and around them. The archaeological dilemma about the function of “fieldhouses”
illustrates that it may be very difficult to determine the specific use of
places by simply inspecting those sites. Nonetheless, all of the site types that
constitute a settlement/mobility system are integral participants in the overall
organization of that system, and they must be understood before the locations of
other components of that system can be predicted. Another implication of these
patterns of space use and reuse is that a large and important portion of the
archaeological record may be relatively continuous across the landscape, difficult to
discover using current survey methods owing to low density of discarded materials,
and very hard to talk about in terms of any equivalency between perceived clusters
of materials (sites) and past behavioral episodes.


The reuse of places through time also raises questions about the practice of
equating clusters of materials with sites, at least insofar as sites are automatically
interpreted episodically and as having locations that are predictable on the basis of
their proximity to important resources. Moreover, site size as a functionally discriminating
factor may be skewed by the reuse or lack of reuse of structures or of the
places where previous structures had been. In the residential camps of the northern
Ute, for example, menstruating women were required to build a new menstrual hut
each month; these were similar to family shelters in size and functional characteristics,
having internal hearths and activity areas, and they did not occupy areas where
previous menstrual huts had been built (Smith 1974). If a hypothetical northern Ute
residential camp were occupied by an extended family including eight adult
females, half of whom were pregnant at any given time, approximately 68 new
menstrual huts would be constructed each year. If each old hut structure remained
visible for 50 years, as some taphonomic studies indicate might be possible, and if
the camps were continuously occupied over these years, this single Ute camp would
accrue 3400 menstrual hut locations. What would normally be classed as a very large
site may actually be the remains of multiple reoccupations of a single location by a
relatively small group.
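
The accumulation in this hypothetical case can be sketched in a few lines of Python. The function name and its parameters are illustrative assumptions rather than part of any published method; the only inputs taken from the text are the stated rate of approximately 68 new huts per year and the 50-year span of visibility.

```python
def visible_structure_locations(new_per_year: int, years_visible: int) -> int:
    """Count structure locations still visible at a continuously occupied camp.

    Assumes, as the text does, that structures are never rebuilt on an old
    spot and that each one stays visible for `years_visible` years, so the
    visible total after that span is simply rate * span.
    """
    if new_per_year < 0 or years_visible < 0:
        raise ValueError("inputs must be non-negative")
    return new_per_year * years_visible


# Figures stated in the text: about 68 huts per year, each visible for 50 years.
print(visible_structure_locations(68, 50))  # 3400
```

Changing either input shows how sensitive apparent site size is to the assumed visibility span rather than to the size of the occupying group.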

An exciting account by John Wesley Powell, an ethnographer who worked
with the southern Numa (Ute) for two decades beginning in the 1860s, illustrates
the consequences of reuse of the same general area, but not of the exact spots where
structures had previously been built, at residential camps.
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ts were care that 2 ate for acamp » accugerd a scoend tame and though they all go agam 
vear ater veat to camp ecat the same epremg ot emma!) stream thes oe arabs eck a ors 
ete ter thee bewouas cach tome Whee thes leave 2 camp thee beveuas ate not 
Gestreved and se on comm to 4 customary camper place of the Utes, a gece the 
appr atame of haveng been acupeed by a wets Large tebe, and pereems are candy ied to 
suppose that thousands have been on amped there when om tact perhaps a emall crude ot a 
daren tamibes have been the only persoms & ho have accugeed the grownd ter mam wears 
‘bew ter and Few ler 197! §)) 


The nature of site patterning and the appearance and visibility of archaeological
sites are seldom determined solely by the activities carried out during a single
occupational episode. The archaeological record is instead created by the repetitive
superposition of materials resulting from adjustments of human systems to their
landscape through mobility. All components of these systems must be located,
studied, and understood through the explanatory process before any can be successfully
predicted.


TECHNOLOGICAL STRATEGIES, DISCARD BEHAVIOR, AND 
THE ARCHAEOLOGICAL RECORD 


Analyzing the differences and similarities among and within collections of
cultural materials that are found at places (that is, assemblage variability) is often
thought of as something to be done in the future, after the cultural resource
manager's work has ensured the protection of significant sites. Unfortunately, this
cannot be the case in any program directed toward predicting the locations or other
characteristics of sites and resources. In order to understand the workings of past
systems and the mechanisms behind the spatial organization of activities, we must
be able to tell the parts of systems from one another. In this section we will suggest
that the component parts of human systems can be identified on the basis of the
tools and other materials discarded, combined with information about the organization
of technology.


Modeling Technological Organization 


Ongoing cultural systems occupy a set of functionally and spatially differentiated
places. If we study these places simply by grouping together sites that are
similar, we cannot hope to understand the system as a coherent whole. In order to
understand past systems we must find a way to group together the different parts of a
single cultural system or type of adaptation. Such parts of the cultural system may
occur in the form of clumped distributions of artifacts and features resulting from a
single or from multiple occupations. Assemblages of artifacts resulting from different
functional activities and formed at the same or different times may overlap
wholly or partially in space. In other circumstances artifactual materials may be
relatively sparsely and continuously scattered over large areas as a result of extensive
foraging.


On the organizational level, it is clear that the archaeological record is not
directly or simply equivalent to activity areas or sets. It is accretional rather than
episodic, whether it is of a continuous nature or concentrated into clusters. It is
necessary to sort out the overlapping, accretional sets of artifacts and features 
before functions and roles in the organizational system can be assigned to what we 
see in the archaeological record and before we can approach any sort of locational 
predictions. A consideration of intersite and intrasite assemblage variability is a 
necessary starting point. 


Assemblage variability can be predicted through reference to the model of 
subsistence, settlement, and mobility detailed above. It should again be emphasized
that models are heuristic theoretical constructs that permit us to consider the
range of strategies that human groups might follow and to predict the expected 
results of these strategies. Models allow prediction of consequences; if these 
predictions are confirmed, this tends to validate the usefulness of the model. 
Consequences are predicted from the model through the use of middle-range 
theory (Figure 4.1). 


Curated vs Expedient Technology 


As an example of a middle-range theoretical concept with great potential for 
tying together the dynamic organization of past human systems and the static 
contemporary archaeological record, consider the distinction between curated and 
expedient tools (Binford 1976, 1979). Expedient tools are those that are manufac- 
tured in the immediate context of their use when the circumstances that require 
them arise. Examples of expedient tools are rare in today’s manufactured technology,
but we all use bent coathangers to open locked automobile doors or convenient
sticks to chase frightening dogs. In systemic terms, the use of expedient technology 
would be expected to be greatest in organizational systems geared toward an 
encounter strategy —that is, foraging systems. In the environments that favor such 
a strategy, there is an equal chance of coming across a wide variety of resources; 
there is no need for the participants in such a system to even attempt to predict 
what they will find. Other things (such as material availability) being equal, it might 
well be most efficient for these people to manufacture tools on the spot to meet
specific situations as they are encountered. 


In curated technologies, on the other hand, the tools that are employed are 
planned to fit specific uses that have been anticipated (Binford 1976). This is an 
efficient strategy in environments where the occurrence of resources is predictable, 
and in organizational systems that focus on specialized resources. Collecting strate- 
gies featuring a logistic organization of mobility —dispatching of special task groups 
to procure selected resources—are most likely to exhibit curated technologies. 


As in any modeling effort, of course, these two technological extremes are 
theoretical constructs. Actual technologies employed within a system can be 
expected to be a combination of the two. For instance, foraging people may produce
and use general-purpose curated tools in addition to manufacturing situational
tools. It is probable that the participants in logistically organized systems will 
encounter unplanned situations that require the fabrication of expedient tools or 
the modification of tools with planned uses into tools with new uses. One character- 
istic of curated components of a technology is that they are often the result of staged 
manufacture employed in the face of time stress (Torrence 1983). Time stress occurs 
when resources are clumped or concentrated in space (which requires a focus on 
specific resources to consumer needs) and in time (which requires highly efficient, 
specialized tools). Since collecting resources in such an environment must be done 
in short time periods, there is plenty of time to work on tools; high energy
expenditures in tool design, manufacture, and maintenance assure technological
efficiency. Typically, tools are manufactured and maintained in a staged manner, 
with stages taking place not only at residences but also at special-purpose locations 
occupied on the way to and from locations of time-stressed resource procurement. 
Staged manufacture, resource specialization or focalization, and the use of special- 
purpose locations are characteristic of logistically organized groups. 


Foraging groups are characterized by relatively broad-spectrum resource 
bases—they are generalists in that they exploit a large number of resources, at 
relatively low levels, within a foraging radius even over short time periods. While 
specialists in simple environments must obtain most of their resources during very 
short time periods, this is not the case for foraging generalists, who obtain food 
slowly and constantly. In such a generalist scenario there is neither the need nor the 
opportunity for staged manufacture. If technological components are curated, they 
are manufactured, maintained, and discarded at residential bases. Expedient por- 
tions of a foraging group’s technology will be discarded continuously throughout 
the foraging radius. 


These crosscutting but definitely not independent middle-range dimensions
of variability —collecting vs foraging, resource specialization vs generalization, and 
tool curation vs expediency —are important in a discussion of predictive modeling 
in that they have different implications in terms of the location of the manufacture, 
maintenance, and discard of tools and hence the formation of the archaeological 
record. Expedient tools are manufactured where they are needed, and they are also 
discarded there. In this strategy, the occurrence of expedient tools is isomorphic 
with the activities for which they were used, and the energy put into these objects is 
low; they exhibit little in the way of formalization or style. Most expedient tools 
probably do not look much like tools at all and are therefore either exempted from 
analysis by many archaeologists as “undiagnostic” or included in the category of
debitage. 


Curated tools, on the other hand, are rarely either manufactured or discarded 
in the context of their immediate use. Tools intended for use during the mobile 
activities of special task groups are most likely to be manufactured at residential 
basecamps (Binford 1980) for anticipated uses away from those camps. Curated 
tools, designed to be used for some time, will be more durable than those made 
expediently for immediate discard, although this may not be morphologically 
obvious. Characteristically, however, curated tools are of a compound or complex
nature (Allchin 1966; Oswalt 1973), having hafted components or multiple parts.
These characteristics help to ensure that a curated tool will not be “used up” at the
locus of its use but rather will be brought back to the residential base for rejuvenation
or other maintenance. Under such a curated technology, both manufacturing
debris and broken portions of curated tools will be found at the residential site and
not at the places where the tools were used. The only curated tools (as opposed to
“site furniture,” such as metates) that should be found at the place where they were
used are those that were lost and not recovered, for instance, unrecovered projectile 
points. 
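
The expectations set out in the two preceding paragraphs can be restated as a small lookup, sketched here in Python. The table keys, labels, and function name are invented for illustration and are not a published coding scheme; the sketch also sets aside staged manufacture at special-purpose locations, which the text notes for logistically organized groups.

```python
# Expected loci restated from the text; keys and labels are illustrative only.
EXPECTED_LOCUS = {
    ("expedient", "manufacture"): "place of use",
    ("expedient", "discard"): "place of use",
    ("curated", "manufacture"): "residential base",
    ("curated", "discard"): "residential base",
    ("curated", "loss"): "place of use",  # e.g., unrecovered projectile points
}


def expected_locus(tool_kind: str, event: str) -> str:
    """Look up where the model expects a tool-related event to leave material."""
    try:
        return EXPECTED_LOCUS[(tool_kind, event)]
    except KeyError:
        raise ValueError(
            f"no expectation stated for {tool_kind!r} / {event!r}"
        ) from None


for key, locus in EXPECTED_LOCUS.items():
    print(key, "->", locus)
```

Read this way, only expedient tools are expected to mark the place where an activity actually occurred, which is the point the surrounding discussion turns on.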

Expectations about the presence of discarded tools and debris associated with 
tool manufacture in the archaeological record can be generated from the above 
assumptions. Under a foraging strategy, there are two situations in which discard 
should take place: at the residential basecamp and at the location. Manufacture and 
discard of expedient tools would be expected to take place at both of these loci, with 
the implements being discarded where they were used. Groups using a foraging 
strategy should exhibit major variations in mobility and group size and composition 
during the year or from year to year in response to short-term variations in the 
environment (Binford 1980). This leads to the expectation that the activities 
performed at foraging sites of either type could be quite diverse and could vary with 
time. Since over the long term, at least, campsites would not be chosen with regard 
to the placement of previous camps or locations, this diverse archaeological record, 
particularly those assemblages derived from locations, would tend to be relatively 
continuous over the landscape, given long-term use. Under a foraging strategy, 
variability in residential site assemblages is the result of differences in seasonal 
scheduling of activities and in duration of occupation. In such systems there is a 
pattern of increasing assemblage diversity with increasing site size, as noted by 
Yellen (1976). Among groups practicing a foraging strategy, therefore, the nonresi- 
dential use of the foraging radius leaves nonsite archaeological remains that are just as 
important for archaeologists attempting to predict the operation of these past 
systems as are the more clustered and visible materials that are usually called sites.
This problem of continuous distributions will be discussed at greater length later in 
this chapter. It is quite likely that some components of all human systems leave
dispersed archaeological remains with low visibility, and these remains must be 
studied and understood before the mechanisms behind the placement of activities 
in systems can be explained and used to predict the locations of those activities 
accurately. The record left by expedient activities may be far more easily under- 
stood than that of the more logistically organized portions of past systems. 


Under a logistically organized system the nature of the intra- and interassem- 
blage variability can be expected to be very different from that predicted for 
foraging systems. Collectors use specially organized, highly mobile task groups to 
accommodate situations in which consumers are near one or more critical resources 
but distant from others. In addition to residential basecamps, these groups also 
utilize field camps, stations, caches, and other places for specific functions. Field 
camps under such systems probably outnumber residential camps by as much as 4:1
(Judge 1973). Since these camps can be occupied for long periods and/or be the sites
of intensive processing, they may become as large and visible, archaeologically, as
residential bases (Binford 1980). As noted above, groups organized under a collecting
strategy will be likely to employ a curated technology to some extent, given
their high levels of mobility and activity planning. Discard of those curated tools 
that are employed primarily away from the residence rarely takes place at the locus 
of their use. Collecting strategies are based upon prediction or planning and should 
be expected to occur for the most part in the most predictable environments. This 
means that places where archaeologists today should best be able to predict the 
locations of sites on the basis of resource distributions will harbor assemblages that 
are unlikely to reflect the activities that took place there, since they will have less 
functional correspondence with the “resources” that are used as independent
variables in predicting them. 


The argument might be made that it is not necessary to know the functions of 
sites to be able to predict their occurrence —that using proxy indicators that can be 
measured in the environment today and that “predict” the occurrence of sites
empirically works just as well. This may be true in certain situations, but proxy 
indicators should not be expected to occur isomorphically with the reasons that 
activities took place at certain locations in the past in all cases. It is the mechanisms
behind the placement of activities in space and their resulting archaeological record
that must be understood in order to successfully predict the occurrence of activity 
loci. 


The Reuse of Places and Intra-Assemblage Variability 


Attempts to predict the occurrence of sites that result from the operation of 
logistically organized systems are further complicated because places are reused for 
different purposes, so that many different combinations of activities may take place 
at a single site. For instance, a place might be used as a residential base for several 
months and thus contain tool manufacturing and maintenance-related debris. If the 
site were subsequently used as a field camp, the discarded materials from this 
second use may not faithfully represent activities that actually occurred there. A 
wide range of technological variability of specific and easily differentiated types can 
be expected in the archaeological record produced by a collecting-based systemic 
organization. Investment in such facilities as structures for shelter or storage, 
caching of items to be used later at the site, and other cultural “improvements” of a 
place would also be expected at reused places under such a system. This means that 
differential site function in a logistically organized, collecting system might not be 
obvious on the basis of either site size or site contents. Indeed, as Thomas (1983:80)
points out, 


it is extremely difficult to distinguish field camps from base camps in the archaeological
record. There are behavioral differences to be sure, but these differences are commonly 
subtle and off-the-cuff field designations should always be mistrusted. 
Interassemblage Variability and Mobility


The variability among assemblages at different sites that result from the 
operation of a single system —that is, interassemblage variability —is the result of the 
overlay of an organized series of events. The nature of assemblages that result when 
cultural events interact differentially with natural events has been discussed in 
terms of “grain size’’ by Binford (1980:17). Coarse-grained assemblages are the 
cumulative product of events spanning relatively large time periods, for instance 
several months or a year. Fine-grained assemblages accumulate over a short period
of time. The finer the assemblage grain, the greater the probable content variability
among assemblages, because there is less chance that the total range of activities that 
occur under that system will be found there. The main factor responsible for grain 
size is mobility, but this relationship is far from simple or linear. In a foraging group, 
residential mobility would be expected to be highest in the least diverse, least 
seasonal, and least predictable environments, resulting in an increase in inter- 
assemblage variability. Under logistic strategies, residential mobility goes down, so 
coarser-grained assemblages would be expected in residential sites; the more 
mobile logistic components would, however, be finer grained than the residential 
sites and would thus, as a class, exhibit more interassemblage variability. 
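
A simple probability sketch may help show why finer-grained assemblages should vary more. It assumes, purely for illustration, that a system involves k equally likely activity types and that each event contributing to an assemblage draws one type independently; neither assumption comes from Binford's discussion, and the function name is hypothetical. Under those assumptions the expected number of activity types represented after n events is k(1 - (1 - 1/k)^n).

```python
def expected_activity_types(k: int, n: int) -> float:
    """Expected number of distinct activity types represented in an assemblage.

    Illustrative assumptions: k equally likely activity types and n
    independent events. Each type is absent with probability (1 - 1/k)**n.
    """
    if k <= 0 or n < 0:
        raise ValueError("k must be positive and n non-negative")
    return k * (1.0 - (1.0 - 1.0 / k) ** n)


# Hypothetical comparison: 10 activity types, a short versus a long accumulation.
fine = expected_activity_types(10, 5)      # fine-grained: few events
coarse = expected_activity_types(10, 100)  # coarse-grained: many events
print(f"fine-grained: {fine:.2f} of 10, coarse-grained: {coarse:.2f} of 10")
```

A fine-grained assemblage captures only part of the range, and which part it captures can differ from one assemblage to the next; a coarse-grained assemblage approaches the full range, so such assemblages resemble one another more closely.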


The Explanation of Intra- and Interassemblage Variability 


Two major expectations concerning the relationship between assemblage 
variability and differing degrees of residential vs logistic mobility have been dis- 
cussed above. One expectation is that under increasing logistic mobility the effects 
of curation and the reuse of places will make it increasingly difficult to postulate the 
functions of sites or to predict their occurrence in terms of association with 
particular resources. The other expectation is that under increasing logistic mobil- 
ity there will be increased interassemblage variability, both between residential 
basecamps and special-task locations and among different special-task locations as 
well. The archaeological record in this latter case may appear as a series of sites that 
are relatively uniform in size, visibility, and contents in terms of structures or 
facilities but contain assemblages that are strikingly different in terms of the formal 
attributes of their constituents or at least some of their constituents. 


One of the ways of explaining an archaeological record like that described 
above is in terms of separate technical or cultural traditions, an approach that has 
been dominant in American archaeology since the science’s beginnings (Willey and 
Sabloff 1974). This approach, which has been referred to as the Kriegerian method 
(Binford and Sabloff 1982:143), defines culture types as collections of formally similar 
properties or attributes of cultural materials that are spatially coherent. Data 
collected and interpreted using this approach pose serious problems for archaeologists
and cultural resource managers who wish to understand the operation of past
human systems and the mechanisms behind the archaeological record, yet such an
understanding is critical to successful prediction of the locations of archaeological
materials. A Kriegerian, culture-type approach to assemblage variability virtually
assures that the differentiated components of systems will be treated as separate 
cultures or traditions, making it impossible to consider them as parts of an inte- 
grated whole. And since the components of a system do operate in an integrated 
manner, their locations are just as dependent upon the nature and locations of the 
otber components as upon environmental or other factors. Successful prediction ts 
necessarily based upon the recognition and sorting out of the complementary 
components of systems. Unfortunately, this is something that archaeologists cannot 
do at present, although attempts toward this goal will be discussed later in this 
chapter. 


Technological vs Ecosystems Organization 


The practice of grouping assemblages on the basis of formal similarity encour- 
ages an emphasis on empirical correlations between assemblages (site types or 
culture types) and environmental variables, a practice that is the hallmark of 
present-day prediction attempts. Mazel and Parkington (1981) suggest that a more 
productive approach might consist of regional studies of the interrelationships 
among tools, sets of tools, and resources. These interrelationships are controlled, 
they feel, primarily by the spatial patterning of resources (rather than simply by 
their location) and by the ways in which resource patterning compares with the 
spatial patterning of human mobility within a system. In other words, prediction 
might be based not only on an understanding of human systems but on knowledge 
of ecosystems as well. Ecosystem variables include the patterning of resources in time 
and space and such qualities as environmental diversity and equability. The effects 
of ecosystemic spatial and temporal structures on the predictive effort will be 
discussed later in this chapter. 


Selection of the cultural variables against which to compare ecosystem varia- 
bles may be one of the most difficult tasks presently before the archaeologist. It will 
require very different approaches to sampling, survey, and data collection, record- 
ing, and analysis than are used in cultural resources management today. The 
assemblages that constitute sites must be understood in their entirety:
undiagnostic artifacts as well as diagnostic ones. One new approach, a nonsite or
distributional archaeological survey method, was recently tested by the Bureau of 
Land Management in New Mexico. This project will be discussed in a later section 
of this chapter. 


From a systems perspective it is clear that, at least under certain types of 
mobility and technological organization, the contemporaneous technological “tra-
ditions” often identified in the archaeological record are actually functionally
different parts of the same system. Most of the archaeological record in any one 
place may consist of the remains of different portions of an essentially similar 
system—remains that have been deposited over very long periods of time. The 
archaeological record is not directly explainable in terms of episodic behavior;
rather, 


a detailed consideration of the factors that differentially condition long-term range
occupancy or positioning in macro-geographical terms is needed before we can realisti-
cally begin to develop a comprehension of . . . subsistence-settlement behavior. The
latter is of course necessary to an understanding of archaeological site patterning
[Binford 1980:19; emphasis original].


NATURAL FORMATION PROCESSES AND THE 
ARCHAEOLOGICAL RECORD 


The complex patterning of cultural materials across space is a result of human 
mobility, the spatial patterning of different economic activities, the redundancy in 
economic activities across the landscape, and differences in the locus of artifact 
discard vs that of artifact use. In most cases, this patterning of discarded material 
undergoes additional changes before it is discovered and interpreted by the 
archaeologist (Schiffer 1972, 1983). Processes affecting the deposition, accumulation, 
preservation, disturbance, and exposure of the materials that make up the archaeo- 
logical record have been much investigated in recent years, largely due to such 
interdisciplinary influences as the study of taphonomy of culturally utilized or 
modified organic and inorganic materials (Behrensmeyer and Hill 1980; Brain 1967a, 
1967b, 1969, 1981; Gifford 1977a, 1977b, 1980, 1981; Gifford and Behrensmeyer 1977)
and geoarchaeology (Butzer 1977, 1982; Gladfelter 1977). 


Deposition: The Coincidence of Natural and Cultural Events 


Cultural materials enter the archaeological record through deposition, during 
which process they are buried or otherwise preserved. Although depositional 
processes may be cultural, in most cases they are natural, consisting of aeolian, 
fluvial, lacustrine, or residual aggradation. These natural processes of deposition 
may or may not coincide with episodes of cultural discard. Materials discarded as 
the result of an occupation or activity might lie on the surface for long periods (in
fact, “forever”) without being buried, or they may be quickly buried even as they
are discarded. Materials buried in layers or “levels” are thus not necessarily, or even
often, expected to be the result of single occupational episodes. The nature of
the deposited archaeological record is controlled by the periodicity or “tempo”
(Binford 1982:16) of occupation or use of a place and by the relationship between this
occupational periodicity and the periodicity of depositional processes. If the perio-
dicity of discard is the same as the periodicity of natural occurrences (for instance,
floods) that incorporate the artifacts into sediments, then a regularly strati-
fied archaeological record will result. If discard occurs more often than the natural
encapsulating events, however, cultural materials resulting from multiple behav-
ioral episodes, or multiple activity sets in Carr's (1984:113) terms, will be incorpo-
rated into the same geomorphic stratum.
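

The relationship between the two periodicities can be made concrete with a small
sketch. The Python below is an illustration only, not a method taken from the
sources cited here: the function name, the use of simple numeric dates, and the
rule that each discard episode is sealed by the first depositional event at or after
it are all simplifying assumptions.

```python
from bisect import bisect_left

def group_discards_by_stratum(discard_times, burial_times):
    """Assign each discard episode to the depositional event that seals it.

    Assumption (hypothetical rule): an episode is incorporated into the
    first depositional event occurring at or after the time of discard.
    Episodes later than the last depositional event stay on the surface.
    """
    burials = sorted(burial_times)
    strata = [[] for _ in burials]   # one list of episodes per stratum
    surface = []                     # episodes never buried
    for t in sorted(discard_times):
        i = bisect_left(burials, t)  # index of first burial at or after t
        if i < len(burials):
            strata[i].append(t)
        else:
            surface.append(t)
    return strata, surface

# Same tempo: every stratum holds a single behavioral episode.
same_tempo = group_discards_by_stratum([1, 2, 3], [1, 2, 3])

# Discard more frequent than deposition: each stratum mixes several episodes.
mixed = group_discards_by_stratum([1, 2, 3, 4, 5, 6], [3, 6])
```

With equal tempos each stratum receives one episode; when discard outpaces
deposition, several episodes share a stratum; and any episode that follows the last
depositional event remains on the surface.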


In situations such as the complete radius leapfrog pattern of residential 
mobility, for instance, in which certain logistic sites may be reoccupied or reused for 
different activities within a short period of time, one might expect that episodes of
discard would occur more frequently than episodes of deposition. This would result
in single-layer assemblages, or what Carr (1984:114) calls depositional sets, com-
posed of materials from more than one occupation or function. The nature of the
deposited archaeological record is determined not only by the organization of the
cultural system but by interactions between the organizational system and deposi-
tional processes. This poses another set of problems for the archaeologist, since
“demonstrably associated things may never have occurred together as an organized
body of material during any given occupation” (Binford 1982:17-18). 


Postdepositional Processes 


Another set of processes affecting the ultimate nature of the archaeological 
record can be thought of as postdepositional, occurring after the discard of cultural 
materials. Generally, almost any process that disturbs or acts upon the surface of the 
earth and subsurface deposits also acts upon archaeological materials. Such biologi- 
cal processes as faunalturbation and floralturbation (Wood and Johnson 1978), 
caused by burrowing, trampling, and root-heave, can modify the original distribu-
tion of cultural materials. Chemical and physical processes that affect the archaeo-
logical record include freezing and thawing cycles; mass wasting (gravitational
forces); the growth and wasting of salt crystalline structures; the swelling and 
shrinking of clays; volcanism and tectonism; disturbances caused by the action of 
gas, air, wind, and water; and pedogenesis. 


A somewhat different taxonomy of the postdepositional processes acting on 
the archaeological record is advanced by Foley, who presents five sets of processes 
responsible for burial, movement, destruction, exposure, and “small-scale oscilla- 
tion” (1981a:167) of archaeological materials. Discarded artifacts enter the archaeo- 
logical record through burial by cultural or natural agencies; once assemblages are 
buried they may remain in place or they may be moved through stream action, 
sediment movement, faulting, or mass wasting. At the same time, certain materials 
may or may not be destroyed by physical and chemical agencies while in or on the 
ground. Small-scale oscillation processes include animal burrowing, human disturb- 
ances, root action, and water or wind action; these forces may alter the position of 
components of the archaeological record slightly but presumably do not totally 
disarrange it. Exposure of the archaeological record to water or wind erosion, 
tectonic activity, or human disturbance may alter the distribution of the archaeo-
logical materials as well as make them visible.


Just as variations in the coincidence of episodes of discard vs episodes of 
deposition or burial can create either well-segregated assemblages or palimpsests 
(that is, artifact distributions resulting from the overlay of many separate behavioral 
episodes and the action of postdepositional processes), exposure and reburial can 
also introduce complexities in archaeological patterning. These processes are rarely 
simply gravitational; they usually include some lateral component and therefore are
influenced by small variations in topography. Exposure and redeposition are often
highly localized; deposits from separate occupations may be mixed in one area while 
a few meters away they will be separately stratified. Controlling for the complexi- 
ties caused by differential deposition, exposure, and reburial of artifacts may be one 
of the most difficult and yet necessary tasks facing the archaeologist. Whatever the 
scale at which patterning in the archaeological record is being analyzed, the 
microtopography and geomorphological activity of surfaces must be examined in 
more detail than that afforded by most generally available topographic or surface
unit maps.


The Scale of Depositional and Postdepositional Processes 


Natural depositional and postdepositional processes are not necessarily or even 
often controlled by the factors that caused prehistoric people to visit and use an 
area. Depositional and postdepositional processes are localized and patterned on a 
small scale. Rarely, then, will the actions and results of these natural processes be 
spatially congruent with activity areas or assumed sites. Instead, their effects serve 
to remove the archaeological record yet further from past behavior and the organi-
zation of human systems.


This is not to say that natural processes necessarily render the archaeological 
record useless or uninterpretable. It is common in contemporary archaeology to 
view postdepositional processes as “bad,” as making the archaeological record
unusable or of diminished research potential. This probably arises from the seem- 
ingly popular belief that postdepositional processes are random in their operation 
(Bowers et al. 1983; Kirkby and Kirkby 1976). Almost all modern survey forms have a 
space for an assessment of a site’s integrity; if the site is disturbed, it is too often 
classed as being of limited utility to science and therefore of diminished significance. 
Such an assessment ignores the fact that all archaeological materials, whether from 
“sealed” sites or lying on the surface, have been affected by natural processes. 
Depositional and postdepositional processes are not random in nature; in order to 
assess their effect on our data, however, we must study and understand these 
processes so that we can predict their distribution and impact. Any prediction of the 
occurrence of archaeological materials must incorporate a full consideration of the
effects of depositional and postdepositional processes as intervening factors
between the operation of past human systems and the archaeological record.


This is necessary because the effects of postdepositional processes on what we 
see as the archaeological record may be far greater than we intuitively recognize. 
They not only disarrange flakes and tools but in fact are almost totally responsible 
for most of what archaeologists actually see during surface survey. If the physical
extent of behavioral events that result in discarded materials is of the same general
range of spatial scales as the depositional and postdepositional processes, then there
is some chance that entire sites will be exposed to the archaeologist’s view.
Unfortunately, it is almost inconceivable that this will be the case. The material 
record will almost certainly be acted upon by a series of partially overlapping 
depositional and postdepositional processes of widely varying scales. These proc-
esses will combine the products of behavioral episodes; blur or sharpen (and in fact
probably often create) their apparent boundaries; and differentially affect the place-
ment of artifacts, depending on their sizes and shapes. These effects are all-
important, for they determine where we see sites and what these sites look like.
They also may be responsible for the fact that we think we see “sites” at all in many 
places. These processes must surely be determinable and predictable. The natural 
processes that intervene between the archaeological record and our knowledge of 
the past must be understood before predictive modeling can become an operational 
tool for cultural resource management. This task is discussed and illustrated at 
length in Chapter 9 of this volume, which deals with remote sensing and predictive 
modeling. 


The Usefulness and Integrity of Surface Remains 


Recently Lewarch and O’Brien (1981) argued that surface assemblages can be
used to answer archaeological questions because comparable processes affect the 
patterning of artifacts in both surface and subsurface archaeological assemblages. A 
more realistic way to phrase this might be that archaeologists should be aware that 
natural and cultural processes can affect subsurface or “sealed” archaeological 
patterning just as strongly as they affect surface materials, so no automatic assump- 
tions of total contextual integrity should be made for any observed archaeological 
patterning. 


There are important pragmatic reasons for developing methods to measure the
patterning and content of archaeological surface remains and for using such data to
answer archaeological questions. Of these, the most relevant to the present volume
is that the depositional processes that seal and protect cultural materials after their
discard usually render these materials invisible to the archaeologist, even when
such sophisticated and often expensive techniques as underground radar, proton 
magnetometry, resistivity measurement, and the like are used to search for them. 
For practical purposes, most buried archaeological materials are unknown and of no 
value to the archaeologist until they are exposed. Another reason for paying 
attention to surface assemblages is that the contexts in which stratified deposition 
and burial are most dependable and regular, and in which archaeologists most often
look for and find buried materials, may be the result of only very limited or 
specialized portions of the cultural systems. For instance, while cave sites contain 
well-segregated and well-preserved cultural strata, such sites might have been 
occupied only when the shelter they afforded was necessary, or they may have been 
used only for a specific set of purposes. Most of the components of the cultural 
system may have involved the use of open situations that would be more likely to be 
buried and reexposed, or perhaps not buried at all. Thus, in the archaeological
record these components would be represented only by surface assemblages.


Possibly the best reason for using surface archaeological assemblages, however, 
is that such data can be collected quickly, accurately, and cost-effectively, and they
yield a high return in the form of information that can be used to test models of
human systems organization. In order for us to use this information, however, it is 
imperative that surface archaeological data be discovered, measured, and analyzed 
in ways that are consistent with their nature and with the nature of the organiza- 
tional processes that we wish to explain, as documented in the final section of this 
chapter. 


Natural Processes and “Independent Environmental Variables” 


The importance to predictive modeling of an understanding of postdeposi- 
tional processes becomes clear if we consider the “independent variables” fre- 
quently discussed by archaeologists involved in locational predictive modeling. 
These independent variables are the noncultural aspects of the total environment 
that correlate with site locations. Under an empirical framework these variables are 
used to “predict” (project) site locations. Commonly used independent variables
include soil association, slope, elevation and/or variation in elevation, topographic
aspect, vegetation, distance to water sources and their nature, and various specific 
landform associations (Chapter 9). It is almost always explicitly acknowledged that 
these independent variables themselves may have no causal relationship with the 
placement of sites; they are simply considered to be indicators. In many instances, 
variables may be chosen primarily because they can be taken conveniently and
quickly from topographic maps so that fieldwork is not required; some of the pitfalls 
of this approach will be discussed in Chapter 9. 


In addition, trying to generalize about where prehistoric people lived on the 
basis of where we find their discarded materials circumvents the explanatory 
framework outlined above by equating the archaeological record with past behavior 
without taking intervening processes into account. Correlating environmental 
characteristics with the archaeological record must begin with a consideration of the 
natural processes that determine how we see the archaeological record. Every one of
the independent variables used in empirical, correlative projections could be a
successful predictor because it has relevance to natural depositional and postdeposi-
tional processes (and thus to the visibility of archaeological materials) rather than
for any cultural reasons. 


For example, archaeological materials might be found on ridge tops, in sand
dunes, or near water sources because that is where they are exposed and visible
today. Soil associations are taxa of different types of soils, and these differences are
based largely upon varying parent materials and the time that the soil has had to
develop, both of which may affect the geomorphic processes that cover or uncover
artifacts. Vegetation is an obvious factor in reducing or enhancing archaeological
visibility. Erosion takes place at accelerated rates on steep slopes. And any archaeol-
ogist who has tried to survey the north side of a hill in the early morning or late
afternoon knows that the light there is poor; things can simply be seen better on
south slopes. There is not a single independent variable used in current predictive
modeling attempts that might not have more to do with depositional and postdepo-
sitional processes than with anything that prehistoric people thought or did.

Predictive modeling based on correlations with these variables may actually be 
predicting where we see sites and may have very little to do with how people 
behaved or how their systems worked. This is not to say that the archaeological
record has no systemic, behavioral determinants; previous sections of this chapter
have emphasized that it does. The point is that natural processes are very important
in determining many aspects of the nature of the archaeological record and how we,
as archaeologists, can deal with it. They must be thoroughly understood before
predictive modeling can become either a management or a research tool. Ways of 
measuring, understanding, and even predicting the effects and distribution of 
natural depositional and postdepositional processes will be discussed more exhaus- 
tively in Chapter 9. 


ECOSYSTEMS VARIABLES AND ARCHAEOLOGICAL 
EXPLANATION AND MODELING 


As defined in the first section of this chapter, archaeological explanation is the
process of combining middle- and upper-range archaeological and anthropological
theory with ecosystems theory to form models from which predictions are drawn.
This process begins at the systems level, and archaeological models connect sys-
temic human organization with predictions about the archaeological record.


Human systems obviously exist within ecosystems—they are subsets or com- 
ponents of ecosystems. Ultimately, the nature and predictability of human systems 
and their products will be related at least in part to the natural ecosystem. This is an
explicit assumption in all predictive modeling or projective attempts known to the
authors of this chapter. In fact, the almost universal approach for such attempts is to
compare the distribution of archaeological materials with “environmental varia- 
bles” that are suspected of having been important to past people: the availability 
or lack of water, shelter, firewood, food species, lookouts, south-facing slopes, etc. 


This section will discuss the use of ecosystem variables rather than particularis-
tic environmental resources in the process of archaeological explanation. Ecosystem
variables have considerable explanatory power when incorporated into models of
change in human systems in response to ecosystem properties; they also have
implications for the ultimate “predictability” of locations of cultural resources in
different ecosystemic settings. In keeping with the principle of congruence in levels 
of systems being compared, it is important to examine the global characteristics of 
the structure of the ecosystem in order to predict something about the structure of 
the human organizational system inhabiting it (Figure 4.4). On a lower level, the 
spatial and temporal distribution of that environmental structure is important for 
predicting the spatial and temporal distribution of the human system exploiting it. 
At a still lower-order level in both systems, it is important to be able to characterize
the distribution of particular resources in order to predict the location of specific
prehistoric activities and their archaeological manifestations. The extent to which
this is possible, however, will depend on higher-order characteristics of both
systems.


Human Systems Within Ecosystems 


Ecosystems are composed of individuals enmeshed in populations, interacting 
with other populations in communities. Ellen (1982:74) has provided us with a useful 
modern definition of the ecosystem as 


a relatively stable set of organic relationships in which energy, material, and information
are in continuous circulation, and in which all processes are seen in terms of their
system-wide repercussions. Specific changes, which may theoretically begin anywhere in
the system, trigger adjustment and re-adaptation among the other elements. . . . System
changes take place slowly through conjoint evolution that is biological, chemical, and
physical.


The ecosystem composed of these interacting communities is another example 
of a general living system and likewise exhibits a mixture of predetermined behavior 
and free systems dynamics, as discussed earlier in this chapter and in Buechner 
(1971:45). The species composition of particular locations in a forest, for example, is
always changing in response to fire or other perturbations, although species compo-
sition and dominance in the larger forest may remain relatively stable. Species
composition in seral (i.e., successional) communities varies according to both
random and predetermined processes (Buechner 1971:52-53). 


On an abstract systems level, a number of relationships between ecosystemic 
characteristics and aspects of settlement systems have been demonstrated or 
suggested. Binford (1980) remarked upon the increasing importance of both logistic 
mobility (collecting) and storage among hunter-gatherers in environments with 
increasing seasonality. He notes that foragers, who practice little storage or logistic 
collecting, tend to move from the center of one resource area to the center of the 
next. Kelly (1983) has argued that the resource “accessibility” (the amount of time
and effort required to extract resources from an environment) of plants can roughly 
be estimated by dividing the net above-ground primary productivity of an environ- 
ment by its primary (plant) biomass; animal accessibility is roughly measured by 
dividing secondary biomass by primary biomass. (Net primary productivity is the 
rate of increase over some unit of time in biomass, usually measured in calories.) 
Kelly finds that as resource accessibility measured in this way decreases, residential 
mobility increases. Low resource accessibility and high residential mobility are, in 
turn, correlated with short distances between sequential residential bases, as is 
typical for foragers in the tropical rainforest. 
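

The two ratios that Kelly proposes can be written out directly. The sketch below
is only an illustration of the arithmetic described above: the function names are
ours, the input values are arbitrary placeholders rather than field measurements,
and the units are assumed to be consistent within each ratio (for example,
productivity per unit area per year divided by biomass per unit area).

```python
def plant_accessibility(net_aboveground_productivity, primary_biomass):
    """Rough plant accessibility: net above-ground primary productivity
    divided by primary (plant) biomass."""
    if primary_biomass <= 0:
        raise ValueError("primary biomass must be positive")
    return net_aboveground_productivity / primary_biomass

def animal_accessibility(secondary_biomass, primary_biomass):
    """Rough animal accessibility: secondary biomass divided by
    primary biomass."""
    if primary_biomass <= 0:
        raise ValueError("primary biomass must be positive")
    return secondary_biomass / primary_biomass

# Placeholder values (hypothetical, chosen only to show the calculation).
high_biomass_setting = plant_accessibility(10.0, 100.0)  # low accessibility
low_biomass_setting = plant_accessibility(10.0, 20.0)    # higher accessibility
```

Under Kelly's argument as summarized above, the setting with the lower ratio
would be expected to show the higher residential mobility.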


Some Factors Affecting the Predictability and Location of Human Use of Space


An appreciation of mobility is vital to our understanding of how the archaeo-
logical record is formed. The causes and consequences of mobility are only part of
what we need to know, however, in order to predict the locations and characteris- 
tics of past human behavior. In particular, we need to consider the middle and 
lowest levels in the systems of hierarchy shown in Figure 4.4, both for ecosystems 
and cultural systems. The middle level for ecosystems consists of information about
the spatial and temporal structure of the ecosystem in some region of interest. The 
lowest level consists of information about how the distributions of specific resources 
make up the patches and about specific environmental features (soils, landforms, 
etc.) of the landscape. 


Of the many kinds of knowledge that might improve our ability to understand 
settlement systems and to estimate how well site locations may be predicted, three 
dimensions of variability are most important: the temporal and spatial variability of
resource availability and the degree of economic intensification of the people exploit-
ing those resources. We will first define these three dimensions of variability and
then explore the effects of each variable on settlement systems; each variable will
first be discussed as if it were possible to hold the other two constant. Finally, we 
will give some concrete examples of how these three independent dimensions of 
variability can be used to characterize various settlement systems and environ- 
ments in terms of the likely success of the prediction of settlement locations. 


Spatial heterogeneity in the landscape is called patchiness, a term that is not
readily quantifiable but refers to significant spatial discontinuities in the distribu- 
tion of populations or communities. Intuitively, it is the opposite of homogeneity; 
although all ecosystems are patchy at some scale, the relative homogeneity of the 
tropical rainforest, for example, distinguishes it from the relative patchiness of a 
semiarid landscape. Patchiness encompasses aspects of environmental variability 
that are measurable, including the size and size distribution of patch types, the 
relative differences between patches and their surroundings, and so forth (Winter-
halder 1980:153).


Three terms are especially useful for describing the temporal distribution of 
resources (Colwell 1974; Winterhalder 1980:162-163). Constancy is a measure of the
degree to which a resource is continually available. Rainfall has a high constancy in
tropical rainforests but a low constancy in most areas of the North American
Southwest. Contingency is a measure of the degree to which the availability of a
particular resource can be accurately predicted based on the season, without the 
need for monitoring that resource. In many areas of the Pacific Northwest, anad- 
romous fish runs have high-contingency predictability even though they are not 
constant. Perfect temporal predictability for a resource can be due to perfect con- 
stancy, perfect contingency, or a combination of the two. For example, Bella Coola, 
British Columbia, has moderately predictable rainfall patterns owing to relatively 
high constancy coupled with relatively low contingency. Acapulco, Mexico, has 
equally predictable rainfall as a result of low constancy coupled with high contin- 
gency (Colwell 1974:1151). 
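

Colwell (1974) expresses these three quantities as information-theoretic indices
computed from a table of counts, with resource states as rows and time periods
(for example, months) as columns. The formulas are not reproduced in this
chapter; the sketch below follows the form in which they are usually stated, with
predictability equal to constancy plus contingency. The function names and the
toy matrices are assumptions made for illustration, not data from any real locality.

```python
import math

def _entropy(counts, total):
    """Shannon entropy of a list of counts (zero counts contribute nothing)."""
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def colwell_indices(matrix):
    """Return (constancy, contingency, predictability) for a count matrix.

    matrix[i][j] = number of observations of state i in time period j.
    Requires at least two states (rows).
    """
    s = len(matrix)                      # number of states
    t = len(matrix[0])                   # number of time periods
    total = sum(sum(row) for row in matrix)
    col_totals = [sum(row[j] for row in matrix) for j in range(t)]
    row_totals = [sum(row) for row in matrix]
    h_time = _entropy(col_totals, total)                         # H(X)
    h_state = _entropy(row_totals, total)                        # H(Y)
    h_joint = _entropy([c for row in matrix for c in row], total)  # H(XY)
    log_s = math.log(s)
    constancy = 1 - h_state / log_s
    contingency = (h_time + h_state - h_joint) / log_s
    return constancy, contingency, constancy + contingency

# Toy cases: two states (rows) observed in two seasons (columns).
always_same_state = colwell_indices([[5, 5], [0, 0]])   # perfect constancy
state_tracks_season = colwell_indices([[5, 0], [0, 5]])  # perfect contingency
no_pattern = colwell_indices([[5, 5], [5, 5]])           # unpredictable
```

The first two toy cases are both perfectly predictable, one through constancy
alone and the other through contingency alone, which mirrors the contrast drawn
above between the two rainfall regimes.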


Intensification has at least two manifestations. It may refer to the process of
expending increasing amounts of time or energy to realize the same level of returns,
or it may describe the process through which the same amount of output is obtained
from less and less land, either through increased time or labor inputs or through
more efficient technology. Intensification appears to be closely related to a number
of factors: increasing involvement of groups beyond the family in the regulation of 
production (Sahlins 1972:101-140), increasing population, increasing population 
density, approach to a current carrying capacity, and increasingly complex socio- 
political organization (Harris 1977), to name a few. Harris’s position (1977:70) that 
increasing population and increasing population pressure on resources results in 
intensification of land and labor, which in turn causes increasing sociopolitical 
complexity, may be too unilinear, but the general correlation of this system of 
variables is clear. 


Boserup (1965), Binford (1983:195-232), and many others have discussed factors 
that may be seen either as the causes of intensification or as its symptoms: 
increased population size and packing, decreased mobility, the beginnings of 
serious agriculture, increased sociopolitical complexity, increased importance of 
exchange, the rise of urbanism, and so forth. Intensification is used here simply as 
the name for this large system of covarying variables, organized along the lines 
proposed in Table 4.1. Under certain circumstances intensification may involve the 
adoption of agriculture (Binford 1983:205) or the development of industrialism 
(Wilkinson 1973). 


Some of the following discussion of the effects of spatial and temporal distribu- 
tion of resources and degree of intensification on human settlement systems is 
exploratory, and we know of little empirical proof for some of the relationships 
suggested. This is a starting point for further work in this direction and serves as a 
qualification to simple empirical correlations of the locations of sites with environ- 
mental variables. 








TABLE 4.1.
Selected correlates of intensification

                                         Degree of Intensification
                            Low ------------------------------------------------ High
                                                              Casual or
                                                              Extensive        Intensive
Correlates                  Foraging*         Collecting*     Domestication    Domestication

Modal group size            small (18-120)    moderate        large            large
Generic site types          1-2               5 or more       many             many
Residential mobility        high (15-50       moderate        low              low
                            moves per year)   to low
Investment in facilities    low               moderate        high             very high
Storage                     very little;      seasonal        seasonal         long-term
                            food gathered
                            daily

* All information on foraging groups and generic site-type information on collecting groups from Binford (1980)
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Intensification. What is the importance of intensification for our ability to predict 
where sites might be located in space? What implications might it have fr the value 
of the concept of the “site”? The least intensified hunter-gatherer economies, 
people practicing a foraging way of life, should exhibit high residential mobility, 
practice little storage, gather or hunt food almost daily, and conduct much of their 
hunting on an encounter basis. The Dobe !Kung and the Central Kalahari San 
(Tanaka 1976) provide good examples of foragers. 


Although all human systems may well exhibit some foraging subsistence, and 
thus mobility, behavior, the purest examples of foraging should be found in tropical 
areas where there is relatively little seasonal pulse in the availability of resources. 
Ignoring for a moment the effects of such ecosystemic factors, the following 
observations about foraging systems in general can be made: 


1. In comparison with logistically organized hunters and gatherers (collec- 
tors), foragers should exhibit low population densities and expend relatively 
little energy in food transport and processing for storage. 


2. The tendency for foragers to move themselves to food and water, rather 
than vice versa, suggests that distributions of such resources may in general be 
good predictors of residential bases (if these can be distinguished in the 
archaeological record). As a cautionary note, however, see comments by Foley 
cited earlier in this chapter. Yellen (1976:52) also observes that the !Kung San 
in the northern Kalahari— whose site locations are heavily constrained by the 
availability of water—generally locate their residential bases at least one-half 
kilometer, and often much farther, from a water source so as not to disturb the 
animals that also make use of the water. 


3. Unless the environment is very homogeneous, or unless there is a single
resource that is overridingly critical (such as water), however, the residential 
bases of sequential foragers may be located with respect to different suites of 
resources, since residential bases are used for a short time. 


4. The low population density of foragers suggests that there may be a low 
tendency toward reuse of residential bases (what Binford [1980:7] calls redundancy
in the occupation of particular places) except where there are significant
topographic or other constraints in the physical environment. 


5. Given long-term use of an area these last two observations may mean that 
all favorable resource locations will be occupied. But the small group sizes, 
short duration of occupation, and low rates of residential reoccupation will 
lead to low archaeological visibility, low artifact density, and little boundedness
in space, making application of the “site” concept relatively difficult and
arbitrary. Groups practicing foraging also conduct activities away from their 
residential bases, and activities at these “locations” (Binford 1980:9) can be 
expected to leave only very low densities of archaeological materials that do 
not correspond to established notions of sites. 


The logistically organized subsistence-settlement system of collectors represents
an intensification compared to foraging. A landscape in which foraging is
possible should be able to support more collectors than foragers owing to the
collectors’ increased efficiency in exploiting spatially disjunct but temporally
concurrent resources and in overriding temporal disjunctions of resources with storage.
The implications of this particular intensification for hunter-gatherers have been
explored earlier (see Table 4.1) and include greater reuse of some places in the
landscape, but not necessarily for the same purposes; greater degree of disjunction
between those places and any single “critical” resource; and a wider variety of site
types, which can be expected to differ dramatically in their locational determinants.


Active management of plant and animal resources, including domestication, entails
an additional intensification of the hunter-gatherer way of life. In environments
where both collecting and agriculture are feasible, a particular
landscape should be able to support more agriculturalists than collectors, since the
former more effectively exploit the potential net primary productivity and overcome
temporal discontinuities in resource availability. Although there are some
climates in which storage is difficult, most domesticators of plants and animals
practice more storage than hunter-gatherer groups. Increased storage may lead to
increased investment in facilities and increased residential sedentism (Hitchcock
and Ebert 1984).


Although agriculturalists decrease their residential mobility in comparison 
with most hunter-gatherer groups, their logistic mobility is not necessarily 
decreased; in fact, owing to the heavily altered nature of the foraging radius 
surrounding agriculturalist settlements (Kohler and Matthews 1988), logistic mobil- 
ity may be more frequent, and encompass a wider radius, than among groups with a 
more mobile residential base. Among these groups, however, logistic procurement 
as a means for coping with resource shortages is increasingly supplemented by 
exchange networks involving subsistence and/or sumptuary items. (This is not to
imply that such networks cannot be important to nonagriculturalists in certain
circumstances, as is amply demonstrated by some Archaic period groups in eastern
North America or by the trade network in the Pacific Northwest centered on The
Dalles.) With increasing sedentism, trips away from the residential base are
increasingly likely to emphasize interaction with other groups, rather than direct resource
collection from the natural environment, as their primary goal. 


In general, the effects of increasing intensification in the absence of changing 
ecosystems variables can be summarized as follows: 


1. residential mobility tends to decrease;


2. environmental perturbation in the vicinity of residential sites tends to
increase; the original environmental communities are replaced by communities
at a less mature stage, with higher net primary productivity;


3. logistic mobility and its supplement or surrogate, exchange, tend to
increase.


Given the increasing importance of exchange relationships as a supplement to
logistic mobility for providing access to resources outside the foraging radius, the
location of other groups (and of other components of the settlement systems of a
single group) becomes an increasingly important consideration in the location of
residential sites. The significant disruption in the foraging radius surrounding the
residential sites of agriculturalists and the possible investment in facilities within
this radius (irrigation systems, for example) result in considerable pressure to keep
residential sites outside the foraging radii of other residential sites. On the other
hand, special economic, social, or political ties with other groups may dictate that
inter-residential distances not be too great.


One implication of these changes for the visibility of the archaeological record is
that intensification should lead to increasing visibility for residential bases because 
of decreased seasonality of occupation, increased longevity of occupation, increased 
investment in storage and dwelling facilities, and increased alteration of the natural 
environment. 


Implications of intensification for the visibility of site types other than residen- 
ces are more complicated. For locations within the foraging and field radius of the 
residential base that have relatively stable resources, such as arable soils, location 
reuse may be routine, eventually resulting in high site visibility. Within their 
foraging or field radius, agriculturalists or intensified hunter-gatherers invest more 
in facilities and revisit these facilities more frequently than do groups that regularly 
move their residential bases long distances; this may help to explain the relatively 
high visibility of “fieldhouse” sites in the American Southwest. Locations where
some nonrenewable or slowly renewable resource such as wood is exploited, 
however, may be used in a way that is not substantially different from or more 
visible than the way that foragers use locations away from their residential bases. 


To summarize the effects of intensification for where sites will be located, 
residential sites should increasingly represent a compromise location (Figure 4.5).
Either they should be located not too far from any of the resources that will be 
needed regularly during the increasingly long period that such sites are occupied, or 
they should be located near some important subset of these resources and count on 
kinship ties, trade, or usufruct privileges to obtain the remainder. These predic- 
tions refer to individual residential sites, since the total set of forager residential bases 
on a given landscape may be responding to as many different environmental factors 
as the total set of collector or agriculturalist residential bases. Within the economi- 
cally acceptable zone of possible residential base locations, considerations of comfort 
are not insignificant for a site that may be occupied for several years, and the 
locations of the residential bases of other groups become an increasingly important 
consideration as well. 


The definition of what is a suitable zone for residential sites, both economically
and from the perspective of comfort, may become broader under intensification.
The increasingly complex technology that accompanies intensification permits
intensive use of areas that are unsuitable for occupation by people with a
simpler technology. The development of irrigation, for example, makes agriculture 
possible in places where it could not be practiced without irrigation. Variables 
determining residential base location cannot be assumed to be identical for groups 
at different levels of intensification. 
















EBERT AND KOHLER 














Figure 4.5. Suggested effects of increasing intensification on the location of residential sites.
Horizontal axis: intensification. Curves: number of relevant independent variables
affecting location (independent site basis); proportion of the relevant independent
variables that are environmental.


Finally, site types other than residences may be located for very specific, 
single-resource considerations (for instance, clay or chert quarries), or they may
represent compromises among several variables that are weighted rather differently 
than they are for residences, as is probably the case with fieldhouses. 


Next, let us summarize the effects of intensification on the predictability of site 
location (that is, how strong the association between selected environmental 
variables and archaeological materials should be). The increased population packing 
under which intensification is expected to take place may mean that a smaller 
number of the places in the landscape that fulfill the requirements for use or 
settlement will remain unused; in a fully packed landscape, all suitable locations
may be used. This should make prediction easier in the limited sense that it should
lower the frequency of wrong predictions about where sites are (Figure 4.6).
Perhaps more important, residential bases for collectors or agriculturalists should all
have similar environmental determinants within a particular settlement system, 
whereas forager residential bases within a single settlement system may have quite 
different determinants. The prediction that a single set of environmental determi- 
nants will apply to all residential bases for agriculturalists within a single settlement 
system is weakened, however, by the tendency for exchange to allow communities
to occupy locations with access to complementary rather than redundant resources. 


Figure 4.6. Suggested effects of increasing intensification on the concentration and, hence,
visibility of archaeological materials at residential sites and locations within the foraging
radius where nonexhaustible resources are exploited and on the strength of association of
each of these site types with a single set of independent variables. Horizontal axis:
intensification. Solid line: concentration and visibility. Dashed line: strength of
association with a single set of independent variables (= predictability).


The implications tend to complicate inferential locational modeling. Forager
residential sites cause problems because they may be responding to different suites
of environmental variables and because they fit the concept of “site” poorly.
Residential bases in more intensified adaptations are less subject to these particular
problems, but predictions about their locations also have complicating factors.
Locations of these sites represent a response to an increasing number of variables, an
increasing proportion of which (the locations of other contemporaneous sites, for
example) cannot normally be used for prediction. This discussion makes clear the
theoretical basis for Kohler and Parker's (1986) insistence on modeling different
adaptation types in one area through time, but in order to do this we must be able to
“sort out” the overlapping archaeological records.


Spatial Heterogeneity. Let us now briefly consider the effects of increasing spatial
heterogeneity (patchiness) while ignoring intensification and temporal predictability.
The aspects of spatial heterogeneity that have the most important implications
for where sites will be located and how visible and predictable they will be are
the degree to which the critical, nonsubstitutable resource patches overlap, the
extent to which each resource type is concentrated, and the distance between
patches of substitutable resources. First, we suggest that the strength of association
between the distribution of archaeological materials and the distribution of a
particular resource type (and therefore the predictability of those archaeological
materials) should increase as resource patches


1. become more concentrated in space, so that equivalent resource-type
patches are increasingly distant from one another; and


2. overlap more in space with other nonequivalent (nonsubstitutable)
resource-type patches.


These proposed relationships are in accordance with common sense. The
occurrence in a single location of more than one critical, nonsubstitutable resource
(say fuel, large game, and roots) increases the likelihood of use, and reuse, for that
location. If equivalent resource types (for example, carbohydrate resources with
similar processing requirements and storage characteristics) are fairly continuous
across the landscape, the strength of association between archaeological materials
and any one of those resource types should be low. Equivalently, if patch size is very
large, or if patches are close together, predictive success will tend to be low. It is
important to remember that resources include things other than food; fuel is
probably universally needed, but other “amenities,” such as well-drained sediments
deep enough to enable construction of a pithouse, may be peculiar to
particular adaptations. We do not necessarily know the identity of these food or
nonfood resources, however.


The visibility of archaeological materials, and to some extent the ease with
which the concept of “site” may be applied, should increase under the same
circumstances in which predictability increases. The same environmental circumstances
that serve to bind environmental features and archaeological materials
closely together should also serve to concentrate those materials into sites. It should
be noted, however, that sites, even in these systems, are not the remains of discrete
episodes of behavior. Because concentrated materials are easier to find than
dispersed materials (Wandsnider and Ebert 1984), concentration increases visibility
of clusters. We do not advocate the use of the concept of “sites” without recourse to
the entire explanatory modeling process and explicit recognition of the different
meanings that the term site might have.


Where there is considerable spatial overlap among critical nonsubstitutable
resources, a relatively small number of independent variables should adequately
predict the presence or absence of archaeological materials. That is, where there is
strong spatial correlation among the potentially important environmental variables,
a few may successfully stand for many. Where spatial overlap among resources is
low, a larger number of proxy environmental variables may be required for prediction.
The general relationships suggested here between aspects of the spatial
structure of critical resources and aspects of the predictive modeling process are
graphically summarized in Figure 4.7.


Temporal Predictability. What, finally, are the effects of increasing constancy
and contingency in the temporal distribution of various resources on the predictive
process? Remembering that constancy and contingency can be summed to create a
measure of temporal predictability, we propose that archaeological materials will be
relatively concentrated, visible, and predictable in places where resources have
either high constancy or high contingency; archaeological materials tend to be
spatially predictable where resources are temporally predictable (Figure 4.8). (This
prediction ignores concurrent variability in the spatial structure of resources;
obviously, spatial concentration or dispersion of resources, as outlined above, also
affects these relationships.) Places where both constancy and contingency in the
temporal distribution of resources are low will not favor concentrated, repetitive, or
long-term use and in general should not be associated with residential site types.
High constancy of resource availability should favor low residential mobility, while
high contingency should favor regular seasonal reuse. The coastal salt marsh sea
island estuarine systems of Georgia and the Calusa area of southwest Florida are
examples of environments with high constancy in the resources critical to human
survival (Marrinan 1975). Most noncoastal North American environments experience
greater seasonal pulses in temperature or precipitation, reducing the constancy
of most critical biotic resources. The large rivers with their runs of anadromous
fish and the root-gathering areas of the Columbia Plateau provide good
examples of high-contingency environments.
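

The statement that constancy and contingency can be summed into a measure of temporal
predictability can be made concrete with a small calculation. The sketch below assumes the
entropy-based indices of Colwell (1974), which the constancy and contingency vocabulary used
here appears to follow; the function name, the layout of the input table (resource states
as rows, seasons or other time periods as columns), and the example counts are illustrative
assumptions, not data from any study.

```python
import math


def _entropy(counts, total):
    """Shannon entropy (natural log) of a list of counts; zero cells are skipped."""
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def colwell_indices(matrix):
    """Return (constancy, contingency, predictability) for a frequency table.

    matrix[i][j] is the number of observations in which the resource was in
    state i (for example "scarce" or "abundant") during time period j.
    Assumed formulas, after Colwell (1974):
        constancy      = 1 - H(state) / log(s)
        contingency    = (H(time) + H(state) - H(joint)) / log(s)
        predictability = constancy + contingency
    where s is the number of states.
    """
    n_states = len(matrix)
    if n_states < 2:
        raise ValueError("at least two resource states are required")
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    total = sum(row_totals)
    h_state = _entropy(row_totals, total)
    h_time = _entropy(col_totals, total)
    h_joint = _entropy([cell for row in matrix for cell in row], total)
    log_s = math.log(n_states)
    constancy = 1 - h_state / log_s
    contingency = (h_time + h_state - h_joint) / log_s
    return constancy, contingency, constancy + contingency


# Hypothetical example: a resource that is always abundant in two of four
# seasons and always scarce in the other two (rows: abundant, scarce).
seasonal = colwell_indices([[5, 0, 5, 0], [0, 5, 0, 5]])
```

Under these assumptions a resource that never changes state scores high on constancy, a
resource that changes state on a fixed seasonal schedule scores high on contingency, and
both are fully predictable; a resource whose state is unrelated to season scores low on all
three indices.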


The Interaction of Intensification, Spatial Heterogeneity, and Temporal Predictability


Finally, how do these three dimensions of variability (economic intensification,
spatial heterogeneity, and temporal predictability) tend to interact? This is
the important question for predictive modeling, since it is artificial to discuss these
dimensions as if they were totally independent of one another. It seems obvious that
certain kinds of spatial and temporal variability in resources require some intensification
practices, particularly storage, before the resources can be exploited at all.
Arctic adaptations to resources with low constancy, only moderate contingency,
and large distances between critical resources are good examples of this. Other
combinations of environmental factors allow either a forager way of life or more
intensified economies to thrive; under these conditions we might expect some
historical tendency for the replacement of foragers by collectors and perhaps
agriculturalists following the competitive exclusion principle (Bettinger and Baumhoff
1982; Kohler 1976). Relatively low constancy coupled with high resource
productivity and relatively little spatial overlap in critical resources has seemed to
favor intensification in many temperate portions of North America. This intensification
involves increased population, increased packing, decreased residential
mobility, increased storage, and even production of storable foods. Still other kinds
of spatial and temporal variability discourage or select against intensification.
Foragers in the tropical rainforest exploit resources that have high constancy and
high spatial overlap in the critical resources but low resource accessibility and little
spatial variability. These environmental factors, leading to high residential mobility,
in conjunction with the prevalence of tropical diseases and pests, keep populations
below the foraging capacity of these environments.


Figure 4.7. ... locational modeling


Figure 4.8. Suggested effects of the temporal characteristics of critical environmental resources
on locations of archaeological materials


Implications for Inductive Empirical Models


We have suggested that various economic and ecosystem factors affect (a)
the number of relevant independent (determinant) environmental variables
needed for accurate predictive locational modeling; (b) the extent to which factors
in the natural environment, by themselves, will be adequate predictors of location;
(c) the strength of the association between various predictor variables and the
location of archaeological materials; and (d) the concentration and visibility of these
materials. Let us now explore the implications of these suggestions for the inductive,
empirical modeling of site location commonly practiced today.


First, there is no reason to believe that locations for all site types produced by
all subsistence-settlement systems in all environments are equally predictable.
Other things being equal, predictability (strength of association with critical
environmental factors) should be relatively high in landscapes where equivalent
resource-type patches are concentrated and isolated, have high overlap with other
nonequivalent resource types, and have high temporal predictability. For residential
site locations, accuracy of prediction (which is equivalent to the strength of
association with relevant independent variables) should increase in more intensified
economies. But the location of residential sites in such economies becomes an
increasingly multivariate problem, and the independent variables affecting location
increasingly include locations of other residential sites, information not typically
available to or easily utilized by inferential predictive models.


Other problems for inferential predictive models involve the differential concentration
and visibility of various site types in areas where the resources differ in
spatial concentration, overlap, and temporal predictability, and in economies at
differing levels of intensity. Residential bases become increasingly concentrated
and visible under the same conditions that promote predictability, as reviewed
above. Other site types may or may not become more visible under intensification,
depending on their function and location in relation to a residential base. Other
things being equal, we assume that site types other than residential bases will be
underrepresented in samples from most modern and all older surveys.


Taking these points into consideration, it is unlikely that inferential predictive
models will perform well in areas where resources are not concentrated, overlapping,
and temporally predictable, or where residential sites have low visibility (such
as those of foragers) or high locational dependence upon factors of the social
environment. On the other hand, we can expect inferential predictive models to
perform relatively well when the opposite conditions hold.


One point of this discussion is that we cannot expect inferential approaches to
be equally successful in all circumstances, and even for all kinds of archaeological
manifestations within a single settlement system. Nor is inferential prediction likely
to be completely successful in most applications, both because of the complications
outlined in this section and because of a certain indeterminacy in all living systems,
including settlement systems. Another important point is that archaeologists need
to begin to characterize the environments in which they work in terms of operationalized,
consistent, ecosystemic factors, such as temporal constancy and contingency
and degree of spatial concentration and overlap of resources, instead of by
simple reference to the presence or absence of particular resources at particular
points on the landscape. Admittedly, this will be difficult for even modern environments,
let alone for paleoenvironments that differ from those of today, but we hope
that this section has pointed out the necessity for such characterizations in understanding
how settlement systems are structured and, therefore, how their positioning
on the landscape might be predicted.


DISTRIBUTIONAL ARCHAEOLOGY 


Approaches to Congruence Between Theory and Method 


So far in this chapter we have discussed the effects on the archaeological record
of differences in the organization of human systems, of a number of depositional and
postdepositional processes, and of general ecosystem (rather than single environmental)
variables. We have tried to show the implications of these different
determinants of the archaeological record for modeling and prediction. Some forms
of organization and some temporal and spatial attributes of ecosystems lead to the
formation of an archaeological record that is relatively more visible and predictable
than records formed under other organizational and ecosystem principles. We
have suggested that the least visible and least predictable archaeological record is
created by foraging activities, either foraging components of generally logistically
organized systems or human systems whose subsistence activities are wholly
organized around this mobility/settlement strategy.


What this means is that an expectably large proportion of the archaeological
record left anywhere by all past peoples will consist of relatively continuous,
low-density, low-visibility remains. Such an archaeological record cannot be dealt
with using site-centered discovery and measurement methods; in fact, it may not
even be detectable via traditional survey. In addition, the clustered materials that
result from intensive reuse of circumscribed places (the things we think of as sites)
are superimposed on this more continuous, lower-density record. In order to sort
them out, to distinguish occupational and functional episodes from one another, we
must record artifacts and features as a continuous phenomenon.


If in fact at least part of the archaeological record is continuous, and the
ethnographic (“theoretical”) as well as methodological arguments presented
throughout this chapter support that it is, then the meaningfulness of predictive
modeling based on a “site” vs “not site” concept is called into serious doubt. One
example of why this might be the case will be considered briefly here. In certain
chapters of this volume it is argued that it is not enough simply to show that the
locations of sites are highly correlated with the locations of supposed independent
environmental variables; one must also show that “nonsites” (by which the author
means places that are not sites or areas that do not contain sites) are less strongly or
are negatively correlated with these same variables in order to allow “prediction”
using such a variable.


How are such “not-sites” to be found? Unfortunately, the author continues, it
is too expensive to look for them, but fortunately, according to him, we don’t have
to. Archaeological sites, he contends, are rare phenomena that only occur “about 1
percent of the time.” Therefore, if one randomly chooses points at places where
sites haven't been located through actual survey, it is to be expected that only 1 out
of 100 points will actually be sites by chance, and the rest will be “not-sites.” This
argument is sometimes broadened further: in one geographic information system
study Ebert knows of, the randomly selected “not-site” sample consists of areas 2 mi
on a side, only 1 percent of which are supposed to contain sites by chance.


But just where would one have had to undertake a survey in order to think that
sites only occur 1 percent of the time? Some people reply to this point by admitting,
“Yes, you'll find archaeological materials everywhere you look, but not necessarily
sites.” And this is the real point: How are sites to be distinguished from isolated occurrences
or nonsites or not-sites? By assuming we know that they are only “really” sites 1 percent
of the time? By using different (explicit or implicit) definitions of sites vs whatever
else in each survey, or even within a single survey?
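

The arithmetic on which the “not-site” argument rests, and its sensitivity to the assumed
rate, can be sketched in a few lines. This is only an illustration: the function name is
hypothetical, the points are assumed to be independent of one another, and the second rate
used below is an arbitrary value chosen to stand for a landscape in which archaeological
materials are nearly continuous, not a measured figure.

```python
def expected_site_contamination(n_points, site_rate):
    """Expected number of randomly chosen "not-site" points that in fact contain
    archaeological materials, and the probability that at least one does.

    n_points  : number of randomly selected points assumed to be "not-sites"
    site_rate : assumed proportion of the landscape that contains materials
    Points are treated as independent draws (a simplifying assumption).
    """
    if not 0.0 <= site_rate <= 1.0:
        raise ValueError("site_rate must be a proportion between 0 and 1")
    expected = n_points * site_rate
    p_at_least_one = 1.0 - (1.0 - site_rate) ** n_points
    return expected, p_at_least_one


# The 1 percent assumption described in the text, applied to 100 random points.
assumed = expected_site_contamination(100, 0.01)

# A purely hypothetical higher rate, as if materials were found almost
# everywhere one looked.
near_continuous = expected_site_contamination(100, 0.60)
```

The comparison shows how completely the cleanliness of a randomly drawn “not-site” sample
depends on the rate that is assumed in advance, which is the quantity the site definition
itself determines.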


Elsewhere Ebert (1986) has argued, at length, that one of the biggest problems
that archaeology, particularly cultural resource management-directed archaeology,
has is reliance on an unworkable, insupportable “site” concept. There are, thankfully,
theoretically as well as practically valid alternatives to “site” approaches.
These approaches are soundly based in archaeological literature and practice and
are drawing increasing interest from both the archaeological and managerial communities.
We would like, therefore, to conclude this chapter by offering an example
of a methodological direction designed to record the continuous archaeological
record. We believe that many such methodological innovations, critically informed
by both general and middle-range theoretical concepts, will be needed before we
can learn to predict characteristics of the archaeological record and locations of
cultural resources accurately.


Background: Nonsite and Off-Site Archaeology


Recognition of the complexities of the formation of the archaeological record
coupled with dissatisfaction with most traditional means of recording this record
has led a number of archaeologists working in different parts of the world and with
different problem orientations to recommend new ways of approaching the spatial
patterning of surface assemblages. One of the most promising of these involves what
amounts to a reconsideration of the basic unit of archaeological analysis, which until
this time has explicitly or implicitly been the site. David Hurst Thomas was one of
the first to express dissatisfaction with the site concept in the literature, calling for a
“nonsite sampling” (1975:61) approach. Certain sorts of depositional situations and
certain problem orientations, he argued, make a site sampling approach not only
“inessential, but even slightly irrelevant” (1975:62), and he suggested an alternative
survey method in which individual artifacts, features, and other cultural items
form the minimal operational units. This approach, used during Thomas's Reese
River Ecological Project, was designed to test archaeologically the consequences of
Julian Steward’s model of ethnographic settlement patterns of the Great Basin
Shoshoneans (Steward 1938). If Steward’s model could be shown to describe the
prehistoric case accurately, Thomas reasoned, then the contention of some anthropologists
that historically observed Shoshonean behavior was due to acculturation
in the wake of European contact would be disproved.


In order to study the ways in which “members of a single hunter-gatherer
society moved themselves across the landscape, in a stable yet flexible pattern of
transhumance” (1975:64), Thomas compared the cultural debris left by these people
in each of a number of “microenvironments” or sampling strata in the Reese River
Valley in Nevada. Locations and characteristics of individual artifacts were 
recorded, and artifact-density statistics were used to analyze some aspects of the 
prehistoric systems represented. Although the relationship between these observa- 
tions and the human behavior that created the data was not explicitly defined, 
Thomas’s work remains a provocative illustration of methods of data collection and 
analysis that are not totally dependent on the site as an analytical unit. 


Bettinger (1977a, 1977b) employed Thomas’s methods of density analysis in a 
similar inquiry into the correspondence between ethnohistorically observed behav- 
ior and the patterning of surface assemblages in eastern California’s Inyo and Mono 
valleys. Although both Thomas and Bettinger advanced sound theoretical reasons 
for their nonsite approaches, it is likely that the nature of the observed archaeological
surface record in their study areas was more than a little responsible for shaping
their research designs. In much of the arid and semiarid American West, surface 
archaeological remains consist of large expanses of sparsely distributed artifacts and 
features that can only be sorted into discrete sites by means of arbitrary boundary- 
setting criteria. 


Another environment that is archaeologically similar to the American West is
the arid belt of East Africa extending southward from Egypt through the Rift 
Valley. At approximately the same time that Thomas and Bettinger were working 
in the Great Basin, archaeologists in Africa were beginning to develop their own 
methods of measuring diffuse artifact distributions. Faced with the sparse and 
probably disturbed artifactual evidence from the Acheulean in Kenya and Tanzania, 
Glynn Isaac and his colleagues approached the archaeological record from a
consideration of natural depositional and preservation processes (Bunn et al. 1980; Isaac
1966, 1967, 1978; Isaac and Harris 1978). These studies were oriented toward
assessing the patterning of artifacts within sites and on occupation floors believed at
the time to be the consequence of single behavioral episodes. Isaac was also
concerned with nonsite distributions of cultural items, however—the “scatter 
between the patches” (Isaac and Harris 1975) that makes up a large proportion of 
the total number of cultural items discovered in large areas of arid East Africa. 


As with the American nonsite strategies, sample quadrats were defined along 
the eastern shore of Lake Turkana, and the locations of individual artifacts and 
features within these sample areas were recorded. Isaac (1981) argues that the 
density and patterning of various artifact types are related to prehistoric mobility 
patterns analogous with those of present-day hunter-gatherers. Going a step 
further toward the reconciliation of nonsite and site-oriented archaeology, Isaac and 
his colleagues have more recently suggested what must be seen as yet another 
alternative unit of analysis, the “mini-site” (Isaac et al. 1981:105). Although the
term may be unfortunate, the implication that the remains of many past behavioral
events or series of events might consist of very small or diffuse assemblages is
worthy of consideration.


Perhaps the most systematically developed approach to understanding the 
meaning of archaeological surface assemblages employing the artifact as an analytical
unit is Robert Foley’s “off-site archaeology” (Foley 1980:39-40). This methodology
was the result of Foley’s attempts to compare site locations and the distributions
of resources in a catchment area or “home range” around a site (Foley 1977).
Starting with the assumption that resource usage is distance dependent, Foley
proposed a model in which a study area would be gridded into squares and the total 
relative resource productivity for each area would be calculated on the basis of 
detailed ecological field studies. Next, given the location of a site of interest, isocals 
or areas with consistent extractive values for that site would be drawn. All those 
areas in which the availability/cost ratio for resources was positive would be 
considered to be likely candidates for the home range for that site (Foley 1977:178). 
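
The logic of this model can be illustrated with a minimal sketch. It is not Foley's
own procedure: it assumes, purely for illustration, that the cost of reaching a grid
square rises linearly with straight-line distance from the site, and it reads a
“positive” availability/cost relationship as productivity exceeding that cost. The
function name, the cost parameter, and the grid values are all hypothetical.

```python
import math

def home_range(productivity, site, cost_per_cell=1.0):
    """Return the grid cells whose productivity exceeds the travel cost from the site.

    productivity  -- 2-D list of relative resource productivity, one value per grid square
    site          -- (row, column) of the site of interest
    cost_per_cell -- assumed travel cost per grid square of distance (hypothetical)
    """
    site_row, site_col = site
    cells = []
    for r, row in enumerate(productivity):
        for c, value in enumerate(row):
            # Straight-line distance from the site, measured in grid squares.
            distance = math.hypot(r - site_row, c - site_col)
            # Keep the cell only if what it yields outweighs the cost of getting there.
            if value - cost_per_cell * distance > 0:
                cells.append((r, c))
    return cells

# Hypothetical 3 x 3 productivity grid with the site in the central square.
grid = [
    [1.0, 2.0, 1.0],
    [2.0, 5.0, 2.0],
    [0.5, 2.0, 0.5],
]
candidate_cells = home_range(grid, site=(1, 1))
print(candidate_cells)
```

In this toy grid the corner squares drop out because their low productivity does not
offset the longer trip, which is the distance-dependence the model assumes.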


Operationalizing such an explicit economic model would, of course, require a 
detailed knowledge not only of all relevant prehistoric ecological parameters but 
also of the locations in space of all sites or localities participating in the cultural 
system of interest. In interpreting the preliminary results of archaeological survey 
undertaken in the Amboseli Basin in Kenya, Foley recognized that artifacts seemed 
to be “distributed ubiquitously across the landscape. In contrast to this, demonstrable
primary stratified sites are extremely rare” (1980:39). This was due, he felt,
to at least two broad classes of processes: those arising from the patterning of actual 
human behavior in the past, and those created by postdiscard taphonomic, deposi- 
tional, and postdepositional forces working upon the discarded artifacts. Later 
consideration of the formation stages for the archaeological record led Foley to draw 
a number of inferences upon which the necessity for and methodology of off-site 
archaeology were to be based (Foley 1981c:31): 


1. Sites are nodes in a continuous distribution of archaeological materials.


2. Home-range behavior provides the theoretical underpinning for continuous
archaeological regional distributions.


3. The processes of continued occupancy, leading to accumulation of materials,
and postdepositional mechanics compound the continuous distribution,
as well as increasing its complexity.


4. The artifact, the basic unit of an archaeological distribution, can and 
should be used as the unit of regional analysis. 


The methodology employed by Foley in his Amboseli survey was developed to 
ensure the collection of data—artifact locations and the characteristics of these 
artifacts—on items occurring continuously but not uniformly across the landscape. 
The study area was sampled using methods tested by plant ecologists who have 
“the same problem of integrating small analytical units or data objects (plants, 
artefacts) with large survey areas” (Foley 1981c:34). The sample areas encompassed
0.05 percent of the total study area. 


Two basic classes of data were collected in the sample units. First, the natural 
and particularly the preservational and depositional environments were recorded. 
Sediments were classified, and the natural processes acting on them (erosion, 
compaction, topographic effects, vegetation cover, and animal or recent human 
activity) were noted. Next, artifacts were recorded in terms of raw material, size, 
artifact or flake type, platform, cortex, and condition; taxonomies for pottery and 
associated bone were also devised. In addition to the surface survey, a number of 
experiments designed to test the short-term effects of rainfall, erosion, compaction, 
and other taphonomic processes were undertaken. The exact locations of artifacts 
within the sample units were apparently not recorded, a very significant omission
that, coupled with small sample unit size (5 by 50 m), precludes any but the grossest 
density-based spatial analyses. 


Data on the occurrence of artifacts in the sample units were extrapolated to the 
entire study area, and density contours were drawn. Other contour maps also 
extrapolated from the sampled areas to the total study area depicted densities of raw 
materials and artifact types, proportion of cores to other artifact types, artifact 
length and width, occurrence of retouch and edge damage, and other artifact 
characteristics. Foley's analysis of his Amboseli data, like his earlier work (1977), 
proceeded from a goal of examining humanly important aspects of the environment. 
He attempts to do this by formulating models that predict the areas of most intense 
use by past groups with pastoralist and hunting-gathering adaptive strategies and 
then testing these predictions using artifact density data. 


These pioneering efforts to arrive at congruence between theoretical ideas 
about the formation of the archaeological record and methods of discovering, 
measuring, and analyzing cultural resources inspired two recent experiments with 
adapting nonsite or “distributional” archaeological survey to cultural resource 
management. These experiments are discussed below.


Distributional Archaeology: Paths Toward
Theoretical/Methodological Congruence


In order to apply the theory-based explanatory framework examined earlier in
this chapter to the archaeological record, it is absolutely essential that there be a
congruence between theory and method. Such approaches as nonsite or off-site
archaeology, in which the artifact is the unit of discovery and analysis, are certainly
a step in the right direction when we are attempting to deal with continuous aspects
of the archaeological record, as discussed above. There are shortcomings in these
approaches, however. One of the chief problems with recent artifact-oriented
approaches has to do with sampling. If it is the patterning in the continuous
distribution of archaeological materials that we must measure, then the best way to
do this is by choosing a relatively large “window” through which to look: by
surveying for, discovering, measuring, analyzing, and interpreting archaeological
materials over relatively large, contiguous sample units.

The remainder of this section will describe what one of the authors and his
colleagues (Ebert et al. 1983) have referred to as distributional archaeology.
Distributional archaeology is a nonsite-oriented approach that yields data that are
congruent with the theoretical concepts of mobility and artifact discard presented above.
Distributional archaeology has been carried out in two different governmental
contexts as this volume goes to press. In 1983 the Bureau of Reclamation and the
National Park Service funded a distributional survey at and around Fontenelle
Reservoir in southwestern Wyoming, and the Bureau of Land Management Las
Cruces (New Mexico) District recently conducted a distributional survey near El
Paso in conjunction with the Navajo-Hopi Land Exchange.


Unfortunately, no detailed accounts of these surveys have yet been published,
although a number of papers and reports are available (Ebert 1983a; Ebert et al. 1983;
Larralde 1984; Wandsnider and Ebert 1983, 1984; Wandsnider and Larralde 1984).
These papers have been compiled in a report edited by Drager and Ireland (1986).
This section will not provide an exhaustive discussion of this methodology but
rather will summarize some of the main points. Distributional archaeology is by no
means fully perfected, and experimentation with similar approaches should be
encouraged.

Distributional archaeology was conceived with several major objectives in
mind. It is oriented toward the relatively complete and continuous survey of
archaeological materials (artifacts and features) over large contiguous areas.
Large areas relative to the scales of the archaeological patterning must be surveyed,
and their contents analyzed, if we hope to sort out overlapping distributions in the
continuous archaeological record. The distributional archaeology methodology
calls for discovery of artifacts and features through intensive surface survey,
recording of the location of each artifact or feature as a point in space, and consistent
in-field coding of artifact attributes. All artifacts, including nondiagnostic tools and
debitage, are recorded.


The Seedskadee Project


Survey Design


The Bureau of Reclamation/National Park Service survey around the Fontenelle
Reservoir was called the Seedskadee Project (Figure 4.9). This survey was
designed as an experiment to test systematic survey methods for recording the
continuous archaeological record. The survey design was directed by two major
propositions: (a) units of analysis and discovery structure the ways in which
archaeologists think about the nature of the archaeological record and, in fact, what
is found during fieldwork (Binford and Sabloff 1982); and (b) very little is known
about what the archaeological record means or what it looks like. For these reasons,
the units of analysis employed during the survey had to be units with little or no
meaning already attached. Individual artifacts were chosen as the units of discovery
and mapping; attributes of artifacts were chosen as the units of data recording. The
discovery and recording methods used were carefully designed to minimize biases 
in what was recognized as an artifact, what data were considered to be appropriate 
to record, and how those data were recorded.


Figure 4.9. Location of the Seedskadee Project, a distributional (nonsite) archaeological survey
undertaken by the National Park Service and the Bureau of Reclamation in southwestern Wyoming.


A simple random sample of 25 sample units, each 500 by 500 m, was surveyed during
the Seedskadee Project (Ebert 1983a). The sample was not stratified by environmental
zones because the zonation present was thought to represent differential
surface geomorphological processes rather than past natural conditions.
Responsibility for data recovery was delegated to three separate crews. A five-member
discovery crew was responsible for finding and flagging artifacts and for maintaining
even ground coverage in precisely controlled 5 m transect intervals. The
data-recording crew consisted of three individuals who numbered the pinflags marking
artifacts and recorded artifact attribute data in a format designed for easy computer
input after the fieldwork phase of the project was completed. The two-person
mapping crew was responsible for provenience control of artifacts, most of which
were mapped individually using an electronic distance measuring (EDM) device
and a prism. In areas where artifact density was very high, mapping of individual
items was abandoned, and 1 m grids became the provenience unit.
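
A simple random, unstratified draw of this kind can be sketched as follows. The sketch
is illustrative only: the number of candidate units in the sampling frame (here a
40 by 30 grid) and the random seed are hypothetical, and nothing is implied about how
the Seedskadee units were actually drawn.

```python
import random

def draw_sample_units(n_cols, n_rows, n_units, seed=None):
    """Draw a simple random sample of grid units, without replacement or stratification.

    Each unit is identified by its (column, row) position in the sampling frame;
    every unit has the same chance of selection regardless of environmental zone.
    """
    rng = random.Random(seed)  # a seeded generator makes the draw repeatable
    frame = [(col, row) for row in range(n_rows) for col in range(n_cols)]
    return rng.sample(frame, n_units)

# Hypothetical frame of 40 x 30 candidate units; 25 units drawn, as in the text.
units = draw_sample_units(n_cols=40, n_rows=30, n_units=25, seed=1)
print(len(units), "units selected")
```

Because the draw ignores environmental zones, any zone may by chance receive few or no
units, which is the trade-off accepted when a sample is left unstratified.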


When additional artifacts were found by the recording crew, they were flagged 
separately. The distributions of these later finds often resembled the results of 
traditional site surveys in that they tended to be far more clustered than the 
distributions marked by the discovery crew. As a rule, highly visible artifact 
concentrations received more attention than interlying areas, as is the case with
traditional survey methods. The items found by the recording crew often doubled 
or tripled the number of artifacts recorded in a sample unit. 


General Results of the Seedskadee Distributional Survey


The end product of these survey procedures is a data base that consists of some
170,000 coded attributes, predominantly locational data and lithics descriptors from
17,000 artifacts. Analysis of the Seedskadee data base, emphasizing the search for
spatial patterns among attributes, is presently proceeding along lines that will be
discussed below. Some preliminary impressions gained from the Seedskadee
experiment, however, have immediate implications concerning the appropriateness of
the approach and the nature of the contributions that it can make to predictive
attempts:


1. There were prehistoric artifacts in all environmental zones. They
occurred in differing (but usually unexpectedly high) densities and in many different
kinds of distributions that appear to vary in both spatial configuration and
content. It seems that the kinds of distributions encountered at Seedskadee would
confound the usual methods of doing predictive modeling (i.e., defining environmental
parameters for site location) because the data base is gradational in distribution
and density, rather than made up of discretely bounded “sites.”


2. The harder one looks, the more one finds. Although this is a simple
observation, its repercussions for management of archaeological resources are
profound, since RFPs generally emphasize acres surveyed rather than cultural
resources located per dollar spent. The perception that archaeologists have of the
archaeological record is a direct function of the context of discovery: survey
interval; time spent on sweeps, on flagging concentrations, or on recording
contents of grid squares; and external and internal crew goals and conditions.

It was also observed that surface and subsurface are relative, dynamic terms. This
point is easily illustrated in areas like dunes, where the acts of discovery, mapping,
and data recording change the surface archaeology: artifacts are buried and uncovered
through scuffling and trampling during the course of the survey itself.
Noncollection survey is often (probably always) destructive of the archaeological
record. Not only does the survey have a direct impact on the location of artifacts, it
is likely that indirect impacts, such as alteration of the soil's surface and of
vegetation, will affect the rates and nature of local natural processes in the future.


3. Error, variability, and sources of bias in method and results must be
evaluated and explained. To address such problems, two control experiments were
included in the project to help in the evaluation of data reliability. In the first, a
sample unit was seeded with “pseudofacts”: nails and washers painted to approximate
the color of the ground and natural lithic materials occurring in the area. Some
of these items were distributed in clusters or “sites,” while others were placed
individually as “isolated occurrences.” These were flagged and recorded by the
discovery crew, which yielded information about accuracy of the discovery procedures.
Approximately 55 percent of the pseudofacts were recovered by the discovery
crew at a 5 m transect spacing, with an additional 10 percent being found by the
follow-up analysis crew. More interesting, however, were the proportions of clustered
vs isolated pseudofacts found. The discovery crew located 68 percent of the
clustered artifacts but only 16 percent of the isolated items (for the analysis crew the
figures were 12 and 6 percent, respectively; Wandsnider and Ebert 1984).


In a second methodological experiment, a purposefully manufactured lithic 
assemblage was independently coded by the three principal data recorders. There 
was considerable inconsistency among coders even though they inspected the
assemblage at the same time under the same conditions. It is possible to control for
such inconsistencies if their extent is known, however, and procedures for doing so
are discussed at length by Larralde (1984). 


4. With a systematically organized, multicomponent survey team such as the
three-part Seedskadee crew, portions of the crew can complete their individual
tasks at their own speed and under ideal conditions, and this greatly increases the
yield of actual product (in terms of information) per person-hour worked. In a
period of approximately seven 10-person weeks, some 170,000 attributes were
recorded. This is the information equivalent of 2000-3000 of the most detailed site
recording forms in use in the United States. Although the amount of ground
covered during this time (625 ha or 1544.35 acres) is less than for most traditional,
site-oriented surveys, the information yield is high. The information-yield argument
is very important when considering the cost-effectiveness of any in-field
data-collection program.


The question might be asked, of course, just what the real “information
equivalence” is between 170,000 artifact attributes and the data contained on 2000-3000
detailed site forms. In order to answer this, what those data consist of must be
explicitly considered. Distributional archaeological data consist of known point
locations and characteristics of the physical materials that make up the archaeological
record. Site data in almost all cases consist of hastily formed opinions about
abstract boundaries of supposed past occupations, guesses as to how many artifacts
might be found within those boundaries if one were to count them, a list of
diagnostic materials found during a walk around the “site,” and the surveyors’
enumeration of the cultures that occupied the area in the past and what the
members of those cultures were doing there (camping, chipping stone, hunting,
etc.). We would suggest that in most cases “information equivalence” isn’t even
the right framework for such a discussion. The difference is information vs
abstractions.


5. Even though finding sites is not the point of a distributional survey, the 
results of spatial clustering routines run on the Seedskadee data suggest that the 
distributional survey discovered more “sites” than recent traditional surveys in the 
immediate project area. This is true even if allowance is made for the intensity of 
survey. The Seedskadee survey was 3-6 times as intensive as 15-30 m transect 
interval surveys done recently in the area (Reynolds 1983); our first impression is
that the Seedskadee distributional survey located from 10 to more than 50 times as
many sites as the traditional surveys did. This means either that linear or sinusoidal
intensity-to-yield models of surface survey results such as that presented by Judge
(1981) are unwarranted or that we did not reach the hypothetical falloff point even
at a 5 m transect interval. Are even smaller transect intervals necessary in certain
situations? 


6. Field observation during the Seedskadee Project revealed that the scale of
patterning of the natural processes that affect the visibility, preservation, and
integrity of the archaeological record is very local. These processes are
controlled by local topography and other small-scale factors and thus are often
smaller in scale than culturally caused clusters of artifacts. As discussed above, it is
necessary to factor out the effects of natural depositional and postdepositional 
processes before one can decide what cultural patterning looks like. This means 
that extremely localized, small-scale geomorphological mapping and process mea- 
surements over time may be absolutely necessary before any predictive modeling of 
artifact or site distributions can be done. 
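
Several of the figures quoted in observations 3 and 4 can be checked or combined with
simple arithmetic, as in the sketch below. Three points are assumptions rather than
statements from the project reports: that the discovery-crew and analysis-crew
recovery percentages are additive shares of the seeded items; that “seven 10-person
weeks” amounts to 70 person-weeks; and that simple pairwise agreement is an adequate
stand-in for measuring coder consistency (it is not the procedure described by
Larralde 1984). The example codings are hypothetical.

```python
from itertools import combinations

# Observation 3: proportions of seeded pseudofacts recovered, as reported in the text.
DISCOVERY_CREW = {"clustered": 0.68, "isolated": 0.16}
ANALYSIS_CREW = {"clustered": 0.12, "isolated": 0.06}

def combined_recovery(kind):
    """Total proportion recovered, assuming the two crews' figures are additive."""
    return DISCOVERY_CREW[kind] + ANALYSIS_CREW[kind]

def pairwise_agreement(codings):
    """Proportion of items on which each pair of coders assigned the same code."""
    result = {}
    for a, b in combinations(sorted(codings), 2):
        matches = sum(x == y for x, y in zip(codings[a], codings[b]))
        result[(a, b)] = matches / len(codings[a])
    return result

# Hypothetical codes assigned by three recorders to the same four artifacts.
example_codings = {
    "coder_1": ["flake", "core", "flake", "biface"],
    "coder_2": ["flake", "core", "shatter", "biface"],
    "coder_3": ["flake", "flake", "shatter", "biface"],
}
agreement = pairwise_agreement(example_codings)

# Observation 4: information yield, using the totals quoted in the text.
ATTRIBUTES = 170_000
ARTIFACTS = 17_000
SAMPLE_UNITS = 25
UNIT_SIDE_M = 500
PERSON_WEEKS = 7 * 10  # assumed reading of "seven 10-person weeks"

area_ha = SAMPLE_UNITS * UNIT_SIDE_M ** 2 / 10_000  # 1 ha = 10,000 square meters
attributes_per_artifact = ATTRIBUTES / ARTIFACTS
attributes_per_person_week = ATTRIBUTES / PERSON_WEEKS
artifacts_per_ha = ARTIFACTS / area_ha

for kind in ("clustered", "isolated"):
    print(kind, round(combined_recovery(kind), 2))
print(agreement)
print(area_ha, attributes_per_artifact, round(attributes_per_person_week))
```

Under these assumptions, the two crews together recovered roughly four-fifths of the
clustered pseudofacts but only about one-fifth of the isolated ones, and the 25 units
of 500 by 500 m account for the 625 ha quoted above.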


The Navajo-Hopi Land Exchange Project 


Another example of a distributional archaeological survey in which the site is
not the explicit unit of either recording or analysis is the Navajo-Hopi Land
Exchange Project survey, conducted by the Bureau of Land Management just west
of El Paso. This survey is much larger in scale than the Seedskadee Project and
represents several refinements on the methods used in the earlier survey.


The Navajo-Hopi Land Exchange (NHLE) survey was conducted in three
adjacent survey areas, which together comprise some 16,000 acres (25 mi²) of mesa
top and breaks. Much of the area is covered by a thin sand mantle exhibiting
coppice dunes and blowouts. Previous site-oriented research in the study area
recorded sites with adobe pueblo structures as well as many scatters of lithic and
ceramic materials and isolates. The adobe pueblos are typically not visible unless
they have been disturbed by natural or cultural processes, although their associated
middens have high artifact densities and thus high visibility.


Phase I of the NHLE survey was designed primarily to fulfill management
needs. Its goal was to determine, through a relatively low intensity transect survey,
which areas contain dense concentrations of resources, particularly structure-associated
resources, and should therefore be excluded from the land exchange and
preserved. Phase I was also expected to identify areas in which the nature of the
archaeological remains did not warrant automatic exclusion but did require further
survey, study, and possible excavation prior to the land exchange.


During Phase I, 400 by 400 m and 800 by 800 m sample units (totaling more than
60 km²) were surveyed at 25 m and 50 m transect intervals. All artifacts and features
occurring within 1 m on each side of the surveyors were tallied for each transect, and
densities of materials were calculated along each transect. These density data were
analyzed using a clustering technique in which areas were examined on the basis of
whether they contained portable and/or nonportable containers, portable and/or
nonportable implements, and low- and/or high-volume processing facilities.
Predictions were then made as to which areas should contain structural remains. An
independent, structure-oriented discovery survey was carried out, and the Phase I
density analysis was found to have been very successful at predicting which areas
would contain subsurface structures.
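
The transect density calculation can be sketched as below. This is an illustration of
the arithmetic only, assuming that density is expressed as items per square meter of
the strip actually observed (1 m on each side of the surveyor); the function name and
the example count are hypothetical, and the clustering step is not reproduced.

```python
def transect_density(item_count, transect_length_m, half_width_m=1.0):
    """Items per square meter within the strip observed along one transect.

    The observed strip extends half_width_m on each side of the surveyor,
    so its area is the transect length times twice the half-width.
    """
    strip_area_m2 = transect_length_m * 2 * half_width_m
    return item_count / strip_area_m2

# Hypothetical example: 12 items tallied along one 400 m transect.
density = transect_density(item_count=12, transect_length_m=400)
print(density)
```

Note that at a 25 m or 50 m transect interval a 2 m strip samples only a small
fraction of each unit, so such densities are estimates for the unit rather than counts.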


Two other classes of areas were also isolated during the Phase I survey: those
with very low densities of cultural resources and those with moderate densities of
artifacts and features but without associated structural remains. These areas are the
subject of Phase II, an intensive survey similar to that described for the Seedskadee
Project (Camilli et al. 1988). In this phase 13 units of 400 by 400 m and five units of
800 by 800 m were studied using a 5 m transect interval. Individual cultural items
(artifacts, features) were the unit of discovery, mapping, and analysis.


Certain cultural resources, including unifaces, bifaces, and rim sherds, were
collected during this phase, and some of the areas with surface features, such as
fire-cracked rock and hearths, and some scatters with no features were excavated.
Generally, however, artifact and feature analysis carried out during the course of
Phase II was done in the field. The Phase II in-field coding taxonomy was directed
toward not only the identification of formal tools or diagnostic materials, but
especially toward identification of lithic production strategies.


EBERT AND KOHLER 


Artifact Coding and Analysis 


We need to touch briefly upon a very important subject: what data need to be
coded when artifacts are used as the units of discovery and analysis. It is often
suggested that all archaeological research takes place in unique situations and that
each researcher's problems are different, and therefore that no hard-and-fast rules
can be formulated as to the field methods and analyses archaeologists should use. It
may be that any area containing cultural resources is unique on a very specific
level, just as the distribution of molecules in Maxwell's glass of water was unique.
We would suggest, however, that it is not the unique aspects of the archaeological
record that are of interest, but rather those aspects that can be compared and
contrasted from place to place: the general attributes of archaeological materials.


The practice of separating assemblages on the basis of formal attributes of 
diagnostic artifacts and labeling each of these as a different culture type or tradition 
defeats any attempt to recognize the differentiated portions of human organizational
systems and thus precludes successful explanation, modeling, and prediction.
Methods must be found for recognizing different systemic components and their
overlap. Although some possible directions for this will be discussed below, we do
not, unfortunately, know at present which general attributes of archaeological
items are important in explanation.


It is possible, however, to describe a general direction that might be followed
in determining how to code attributes of archaeological materials. A human system
is composed of, among other things, a series of places where things are done. The
key word here is series, and this chapter began with a discussion of the ways in which
events at each place in the series are important to the operation of the entire system. 
Another set of components in a human system is technological items, which are also 
used in a serial way. Items used at places are sometimes discarded and at other times 
are modified there and used for other functions. Still other times, items are curated 
and taken away to be used at one or probably more different locations. Attributes 
that provide possible clues to the serial nature of technological strategies are, then, 
of major importance in understanding the components of systems. Such attributes 
include not only formal tool designations but also data on the nature of what most 
researchers class as debitage — utilized and unutilized parts of tools, and debris from 
lithic reduction, modification, and manufacturing. 


Analyzing Data from Distributional Archaeological Surveys


It is necessary to establish linkages between the archaeological record and the 
organization of the past human systems that created this record before we can make 
successful predictions about the locations of cultural resources or about their 
meaning, usefulness, or significance in archaeological terms. Previous sections of
this chapter have worked downward through the explanatory framework of
archaeology presented in Figure 4.1, beginning with higher-level, theoretical ideas
about the nature of human settlement mobility systems and middle-range theoretical
ideas about technological strategies, discard behavior, and the natural formation
processes that also affect the archaeological record. In this section, archaeological
method has been considered and suggestions have been made about ways to
discover and measure the archaeological record that are congruent with this
theoretical framework: nonsite or distributional surveys yielding high-resolution
spatial and attribute data collected from artifacts and features, rather than from
sites. Given such data, however, what should we do with them?


This is not an easy question to answer, for it is one of the areas in which the
most concentrated archaeological research has yet to be done. During the past
decade, however, there has been some experimentation in the spatial and content
analysis of assemblages, primarily intrasite analyses attempting to isolate activity
areas within sites. Intrasite analysis is not exactly the same thing as what we will
ultimately want to do with distributional archaeological data, although the intrasite
analysis literature should suggest some ways in which archaeological data must be
analyzed before we can understand patterning in the archaeological record
produced by the action of past systems.


Carr (1981, 1984) suggests that prior to intrasite assemblage analysis it is
necessary to differentiate carefully between activity sets (archaeological materials
used together in space and time in the past) and depositional sets (those materials
that aggregate in the archaeological record), since disjunctions between these two
entities result in clusters of tools or implements that are not automatically equivalent
to activity areas. Associations in the archaeological record may be the result of
implements having been used together, but they can just as well be a result of
overlap of activities through time or of natural depositional and postdepositional
processes that cause polythetic, overlapping depositional sets (Carr 1984:120). The
archaeologist faced with comprehending the intrasite archaeological record must,
according to Carr, use these depositional sets to define (a) the spatial limits of
activity areas and (b) the organization of artifact types into tool kits. These have
been the goals of a number of archaeologists using various methods of intrasite
analysis.


Wandsnider and Larralde (1984) break down contemporary intrasite archaeological
assemblage analysis methods into three basic types. The first of these was
developed by Robert Whallon at the University of Michigan, who became one of
the pioneers of intrasite spatial analysis with the development of his dimensional
analysis of variance (Whallon 1973) and comparison of its results with nearest
neighbor analysis (Whallon 1974). Whallon's more recent work (1984) uses a more
comprehensive spatial method called “unconstrained clustering.” Unconstrained
clustering identifies areas within sites that have similar assemblages by
(a) constructing density maps for each artifact type, (b) calculating the relative
proportion that each artifact type contributes to the assemblage at points across
the site, (c) identifying similar assemblage types, (d) mapping the cluster members
and examining their distribution, and (e) reconstructing the activities that occurred
on the site in light of spatial patterns identified ethnoarchaeologically. Carr (1984)
has criticized Whallon's unconstrained clustering method because it assumes that
activity sets are the result of single episodes of functionally similar behavior (i.e.,
they are monothetic [Carr 1984:136-137]), that tools are always discarded
expediently, that no disturbing postdepositional processes occur, and that activity
areas do not overlap but have sharp borders. These assumptions are probably
unfounded in most if not all cases of human behavior and archaeological record
formation.


Carr (1981, 1984) proposes new techniques that he feels overcome some of the
problems with Whallon's and other spatial analysis approaches; these techniques
describe the distribution of each artifact type within the site. Although Carr's
process is too complex to be described in detail here (see Carr 1981), he uses point
distributions rather than grid cell counts and employs digital filtering, Fourier
analysis, spectral analysis, and histogram equalization, techniques in common use in
the processing of imaged remote sensor data. Such technological means may hold
great promise for archaeological pattern recognition.

A third class of approaches to intrasite spatial organization is exemplified by
the work of Kintigh and Ammerman (1982) and Simek and Larick (1983). Kintigh
and Ammerman's heuristic approach to spatial analysis combines “the sophistication
of intuitive approaches with the information processing capacity and systematic
benefits of quantitative treatments” (1982:31). This method divides artifacts
into types and subjects the distributions of each type across space to a k-means
nonhierarchical divisive cluster analysis. The archaeologist, using internalized
knowledge about the scales and nature of archaeological formation processes,
decides intuitively upon a cutoff point for the number of clusters of each type
formed and then recombines the clusters of different types into a series of
overlapping clusters that presumably represent activity areas.
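
The clustering step in this heuristic approach can be illustrated with a minimal
sketch. It is not Kintigh and Ammerman's program: it substitutes the plain
(Lloyd's) k-means procedure for their divisive variant, and the function name
`kmeans`, the coordinates in `flakes`, and the choice of k values are all
hypothetical. The sum of squared errors (SSE) for each k is the kind of summary an
analyst might inspect when deciding on a cutoff for the number of clusters.

```python
import math
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means on (x, y) point-plotted artifact locations."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)              # start from k distinct artifacts
    for _ in range(iterations):
        # Assignment step: each artifact joins its nearest cluster center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean location of its members.
        new_centers = []
        for i, g in enumerate(groups):
            if g:
                new_centers.append((sum(x for x, _ in g) / len(g),
                                    sum(y for _, y in g) / len(g)))
            else:
                new_centers.append(centers[i])   # an empty cluster keeps its center
        if new_centers == centers:               # assignments have stabilized
            break
        centers = new_centers
    # Sum of squared distances from each artifact to its nearest center.
    sse = sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)
    return centers, sse

# Hypothetical locations (in meters) for a single artifact type.
flakes = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.1, 7.9), (7.9, 8.2)]
sse_by_k = {k: kmeans(flakes, k)[1] for k in (1, 2, 3)}
centers_2, _ = kmeans(flakes, 2)
```

A sharp drop in SSE between k = 1 and k = 2, followed by little further
improvement, would suggest two concentrations for this artifact type; the
recombination of clusters across types remains an interpretive step.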


While all three approaches to intrasite spatial analysis hold promise, they are
all also directed toward specific reconstruction of the things that went on within
sites. Before these methods can be applied to the continuous archaeological record
across landscapes, the scale of application must be increased far beyond that
discussed by these authors. Wandsnider and Larralde (1984) have also pointed out
that each of these three approaches solves only some of the problems of spatial
analysis: Kintigh and Ammerman and Simek and Larick only identify spatial
clusters; Carr describes and compares the spatial organization of artifacts; and
Whallon describes and compares assemblage content. Wandsnider and Larralde call
for methods that permit the description and comparison of assemblages both in
terms of content and in terms of spatial organization or structure, and they suggest
a five-part method building upon archaeological theory as well as inductive
statistical procedures:


1. Development of an artifact taxonomy. This could proceed in several ways:
along deductive lines, based on ethnographic and ethnoarchaeological information;
on the basis of experiments in lithic manufacture that identify the stages of artifact
production and sequential use; on the basis of information about the mechanics of
artifact function (edge angles, etc.); or through purely statistical and inductive
clustering algorithms.

2. The identification of spatial limits. Directed at defining the boundaries of
assemblages, this stage of analysis might be based on interactive, heuristic
techniques, such as those suggested by Kintigh and Ammerman (1982) and Simek and
Larick (1983), which incorporate not only behavioral knowledge but also information
on depositional and postdepositional processes. Perhaps the best way to define
small-scale patterning in natural formation processes is through the use of remote
sensor data, since these data could be machine processed at the same time as
distributional information on clusters.


Other sources of information to be incorporated at this stage may come from a
consideration of the characteristic shapes and sizes of the spatial patterns of human
behavior. During fieldwork with the Nunamiut Eskimo, Binford (1978) identified a
number of different zones of activities and artifact discard, including drop zones,
toss zones, hearth-centered activity areas, and structure or tent scatters.
Recognition of these patterns within overlapping distributions might be accomplished
mechanically by varying grid frame sizes during analysis or by constructing
shape-recognition filters. The larger zone types might be appropriate for discerning
the boundaries of assemblages.

3. Content description and analysis. Once assemblages have been defined, their
contents might be described on the basis of the taxonomy or taxonomies devised in
stage 1, by means of such techniques as principal components analysis, which is used
to compare the composition of different assemblages. Recently, Kohler and Blinman
(1987) have proposed using multiple linear regression to estimate the absolute and
relative contributions of several different periods of use and deposition to the total
archaeological deposits at ceramic-containing sites in the Dolores River Valley in
Colorado. The technique is similar to Stahle and Dunn's (1982) use of multiple
linear regression to estimate the contributions of various stages of bifacial reduction
in a mixed collection. Both of these applications are aspatial, but they contribute to
an understanding of the composition of mixed collections in terms of predefined
constituents and might be used to sort out overlapping activity sets.


4. Structural description and analysis. While stage 3 is directed toward
describing the contents of assemblages, stage 4 provides a description of the spatial
organization within assemblages. This is where the smaller-scale “zones”
characteristic of human activity might be recognized within the overall assemblage
composition through digital filtering. It might also be possible to use small, simple
clusters of materials that seem to result from single discrete activity episodes to
design “filters” to pass through larger, denser, and probably more composite artifact
distributions. Smaller, single-occupation clusters might be expected to exhibit
more central distributional tendencies and higher correlations between artifact
types in space than the larger, more composite distributions. Other filters might
consist of sample frames of varying size that could be passed through complex
distributions in the manner of Whallon's dimensional analysis of variance (1973,
1974). Wandsnider and Larralde (1984) also suggest that the spatial organization of
the different principal components might be inspected.

5. Pattern dissection. This would constitute the last stage in distributional spatial
analysis, according to Wandsnider and Larralde (1984). Larger and more complex
assemblages (i.e., things found together, or depositional sets in Carr's terminology)
are undoubtedly the result of the complete or partial overlapping of many
behavioral episodes. It may be possible to separate these episodes from one another,
and certainly this is a necessary step in comprehending the complex systemic
mechanisms that resulted in the archaeological record at any place.


It is quite possible that some of the procedures suggested by Wandsnider and
Larralde (1984) might be implemented in different orders or as combined steps
rather than separately. Some of them may also be unnecessary, for instance stage
2, in which the boundaries of assemblages are sought. We may never really see
bounded assemblages in the continuous, overlapping archaeological record but
rather may be looking at portions of these through the “windows” provided by our
sample units, by our survey area boundaries, or by natural surface processes.


The Solution: Dedicated Research Using Distributional Data 


It is clear from the foregoing that two general things can be said for
archaeological spatial analysis. The first is that archaeologists do not quite know
how to do it yet, at least in ways that are congruent with the higher-level and
middle-range theoretical ideas that we have about the formation processes of the
archaeological record. The second is that spatial analyses directed toward
understanding the complex, composite archaeological record will probably combine
modern techniques such as digital image processing (some of which are just now
being developed to the point that they will be useful to archaeology) and deductive
reasoning in a complex interactive process. This process will draw upon both
archaeological and ecosystemic theory to arrive at successful archaeological
explanation and thus prediction. Such archaeological analysis is presently a goal
rather than reality, a goal toward which both management and archaeological
interests should be energetically directed.


SUMMARY 


This chapter has been concerned with the method and theory of using
anthropological explanation to predict things about the organization of past human
systems as well as about the archaeological record. The explanatory process
illustrated in Figure 4.1 involves the advancing of models that are used as the basis
of prediction. While at first it might seem overblown to introduce anthropological
explanation into a discussion of “practical” archaeological prediction, it has been
argued and illustrated here that it is only in the context of explanation and
explanatory modeling that archaeologists and managers can hope to make truly
successful predictions of the locations and other characteristics of the materials that
make up the archaeological record. This is so because the things that determine the
locations of the materials that make up the archaeological record are not static,
unchanging properties of the environment that can be measured easily from
topographic or environmental maps.

The archaeological record is not the same thing, or even the same kind of
thing, as the way that past individuals dealt with their environment and the
locations at which they dropped artifacts. The data that archaeologists collect,
analyze, and attach significance to are the product of long-term use of the
landscape. Large numbers of people, organized in different ways, have serially
located their activities across this landscape, manufacturing and differentially
discarding artifacts in ways that changed as the landscapes changed with
paleoclimatic flux, and as the mobility and technological strategies within their
cultural systems changed.

This chapter has advanced a general model of human subsistence and mobility
strategies that vary along a continuum of intensification from a generalist, foraging
strategy through a specialized, collecting organization. This model is not intended to
represent the “whole truth” about past systems. Nonetheless, it provides a basis
for making predictions, and if these predictions prove to be consistent with the
observations about the archaeological record, this would tend to support the
usefulness of the model. If the predictions made on the basis of this model are not
supported by observations of the archaeological record, then an alternative model
or models should be devised. This may be one of the most important problems
currently facing archaeologists: to arrive at and attempt to confirm models
concerning the operation of past systems. This task lends significance to the
discovery and conservation of archaeological materials, and it is therefore the reason
why cultural resources should be managed and preserved.


Before archaeological data can be called upon to support or negate any
explanatory model, however, the archaeologist must take into account the things
that alter or otherwise affect the ways that we see the materials that past human
systems discarded. These factors are also illustrated in Figure 4.1 at the beginning of
this chapter.


There are two basic types of things that happen to the objects that human
systems culturally modify and then discard or abandon. The first of these lies in the
realm of natural processes, which incorporate discarded materials into the earth's
surface and subsurface deposits and which act to preserve, rearrange, or destroy
these materials. Natural processes also make archaeological materials visible to
archaeologists and managers, so that we know they are there and need to be
conserved and studied.


The other factor affecting archaeological materials is that they are discovered,
measured, analyzed, and interpreted by archaeologists. This is the realm of
archaeological methodology. It has been suggested in this chapter that, in order to
be successful at discovering those things we need to know about the archaeological
record in order to be able to predict its locations and characteristics (and thus its
significance in management terms), archaeological methods must be compatible
with the theories we have about the ways in which materials were discarded by past
human systems. It has also been argued that this is not presently, at least very often,
the case, and that we may need to alter significantly the ways in which we deal with
the archaeological record today as archaeologists and managers.


Another very important area of archaeological methodology concerns the 
natural phenomena that we measure and compare with the distribution of archaeo- 
logical materials as they presently exist. These variables must be chosen in accord- 
ance with our ideas about the organization of past human systems if they are to be 
useful in predicting the characteristics of the archaeological record. The things that 
biologists, ecologists, and the people who make topographic maps have measured 
may not be the best variables to use if we wish to elucidate the organization of past 
systems; we have discussed the alternative of using ecosystemic variables in 
archaeological explanation rather than relying on specific resources, species,
landforms, or other convenient proxy “indicators.” In order to use ecosystemic
variables in our modeling and predictions, we may have to do most of the measurement
work ourselves. 


Many archaeologists may disagree with the models of past systems organiza- 
tion that have been advanced in this chapter and with our suggestions about the 
relationships between these models and ecosystems variables and about the conse- 
quences of these relationships for the archaeological record. That is good, for it 
gives us all something to think about and to try to build upon and to alter so that it 
“fits” the archaeological record that we discover and deal with. There are few 
archaeologists, however, who will argue that we do not need to model past systems 
organization to predict the locations and nature of the archaeological record that we 
are all concerned with conserving. 


This chapter, therefore, should not be thought of as advancing any particular
model or models that will best typify what human systems were like in the past, or
how they were related to the world in general. The theme of this chapter is instead
that it will not be easy to model the ways that the archaeological record came about
or to predict where archaeological materials in general, or specific sorts of significant
archaeological materials, will be found. Claims that predictive modeling is easy or
that a particular model is highly successful should be carefully examined in light of
this chapter. Does the model in question consider past systems organization? Are 
empirical “predictive models” of general utility not only in predicting the locations 
of archaeological materials but in explaining the systemic mechanisms behind 
them? If not, they are likely not to be generally successful and applicable, for 
mechanisms must be elucidated before their consequences can be determined. 


We are presently at a very crucial point in archaeological science and in the 
practice of cultural resource management. Management requires that we be able to 
predict the locations and significance of archaeological resources, and archaeology 
must discover how to do this. Fulfilling this goal will require concentrated and 
dedicated research that may not, at all times, appear to be totally directed toward 
the pursuit of simply identifying and conserving sites. Management must be
patient and supportive of the genuine pursuit of archaeological explanation, for it is
only through explanation that we can understand anything about the past through
the archaeological record. Archaeological prediction is a new frontier, and all aspects
of it must be justified and proven in explanatory terms.

Let us assume that the distribution of sites per quadrat for each sampling
population as a whole is also positively skewed. The Central Limit Theorem (Hayes
and Winkler 1971:292) states that the distribution of sampling means for samples
drawn from this population will still approximate normality; the average of the
sampling means will equal the population mean, and the standard deviation of the
sample means will equal $\sigma/\sqrt{N}$ if these are based on repeated random
samples of sufficiently large size. Cochran (1977:42) suggests as a rule of thumb that
sample size should be greater than $25G_1^2$, where $G_1$ is Fisher's measure of
skewness. Table 6.1 shows that for the Utah survey results plotted in Figure 6.2,
only the Circle Cliffs region survey meets this criterion, with a 10 percent sampling
fraction. Only by combining the San Rafael Swell survey area with the Circle Cliffs
survey area was the researcher able to obtain an adequate sample size for the San
Rafael Swell. From an anthropological standpoint, this is a questionable practice at
best.
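
Cochran's rule of thumb is simple enough to check by hand, but a short sketch makes
the arithmetic explicit. The function names below are hypothetical, and the skewness
estimator (third central moment divided by the 1.5 power of the second) is an
assumption about how $G_1$ would be computed from quadrat counts; the three
skewness values passed in at the end are the ones reported in Table 6.1.

```python
def fisher_g1(counts):
    """Moment-based skewness: m3 / m2**1.5, using population (divide-by-n) moments."""
    n = len(counts)
    mean = sum(counts) / n
    m2 = sum((c - mean) ** 2 for c in counts) / n
    m3 = sum((c - mean) ** 3 for c in counts) / n
    return m3 / m2 ** 1.5

def cochran_threshold(g1):
    """Cochran's rule of thumb: the sample size should exceed 25 * G1**2."""
    return 25 * g1 ** 2

# Skewness values taken from Table 6.1, rounded as the table appears to round them.
requisite = [round(cochran_threshold(g)) for g in (1.97, 0.82, 2.47)]

# Hypothetical sites-per-quadrat counts: mostly empty units and one rich unit.
example_g1 = fisher_g1([0, 0, 0, 0, 5])
```

Because the threshold grows with the square of the skewness, even a modest increase
in skewness raises the requisite sample size sharply.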


Several archaeologists have noted that adherence to Cochran's rule will usually
require a very large sample size (e.g., Thomas 1975:68-70). Nance (1983:303) has
suggested another method, based on Monte Carlo simulation, in which a hypothetical
population distribution is created on the basis of the sample data. Repeated
sample selection from this population then allows for a thorough examination of
skewness.
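
The general idea of such a simulation can be sketched as follows. This is not Nance's
procedure, only an illustration under stated assumptions: the hypothetical population
is built by repeating the observed counts, samples are drawn without replacement, and
the names `simulate_mean_skewness` and `observed` and all of the counts are invented
for the example.

```python
import random

def skewness(values):
    """Moment-based skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def simulate_mean_skewness(sample, population_size, n, draws=2000, seed=1):
    """Skewness of the sample mean over repeated samples of n units."""
    rng = random.Random(seed)
    # Hypothetical population: cycle through the observed counts until it is full.
    population = [sample[i % len(sample)] for i in range(population_size)]
    # Draw many simple random samples and record the mean of each.
    means = [sum(rng.sample(population, n)) / n for _ in range(draws)]
    return skewness(means)

# Hypothetical sites-per-quadrat counts from 30 surveyed quadrats.
observed = [0] * 20 + [1] * 5 + [2] * 3 + [6, 9]
skew_small = simulate_mean_skewness(observed, population_size=300, n=10)
skew_large = simulate_mean_skewness(observed, population_size=300, n=100)
```

Comparing the skewness of the simulated sample means at different sample sizes shows
how quickly the sampling distribution approaches symmetry for a given set of counts.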


TABLE 6.1.

Skewness values and sample sizes by study tract

                                                    Requisite
                                       Skewness     Sample Size         Actual
Study Tract                            Value        (n > 25G1^2)        Sample Size

Total 10 percent sample
  Circle Cliffs and San Rafael Swell   1.97         97                  [illegible]
Circle Cliffs
  10 percent sample                    0.82         17                  [illegible]
San Rafael Swell
  10 percent sample                    2.47         153                 [illegible]
White Canyon                           2.47         153                 7

Adapted from Tipps 1984:132, Table #


The task of selecting an appropriate sample size from a skewed population
distribution becomes even more difficult when the subject of interest shifts from
the sample unit to the site. In this case only those units that contain sites are of
importance. Thus, it is not the total number of sample units but the total number of
sample units minus the number of sample units without sites that will determine the
size of the survey. The problem then is to estimate how many units will have to be
surveyed before an adequate number of clusters is obtained. The work of a number
of archaeologists and human geographers suggests that fitting of discrete probability
distributions to supposed settlement distributions may be a useful approach to
this problem (e.g., Clarke 1977; Cliff and Ord 1973; Dacey 1964; Harvey 1967;
Hodder 1977; Hodder and Orton 1976; Hudson 1969; King 1969; Wood 1971). For
example, Wood (1971), Hodder and Orton (1976), and Nance (1983) argue that,
because sites are often rare and clustered events, the pattern of site densities in
many regions can be described reasonably well by the negative binomial
distribution. The negative binomial is described by two parameters, the arithmetic
mean and a positive exponent ($k$; see also Chapter 5). For a region about to be
surveyed, an archaeologist can arrive at an estimate of average site density by using
the results of surveys in nearby regions with similar environments. The positive
exponent, $k$, can be estimated in a variety of ways (see Bliss 1953). The usual
approach in archaeology is first to arrive at some estimate of the sample variance
and then to calculate $k$ using the equation


$$k = \frac{\bar{x}^2}{s^2 - \bar{x}}$$

where $\bar{x}$ and $s^2$ are the mean and variance of the sample (Nance 1983; Wood
1971). Once the parameters are defined, the probabilities of obtaining a certain
number of sites per unit can be calculated in a straightforward manner using the
probability generating function

$$P(x) = \frac{(k + x - 1)!}{x!\,(k - 1)!}\,\frac{R^x}{q^k} \quad \text{for } x = 0, 1, 2, \ldots$$

$$P(x) = 0 \quad \text{otherwise}$$

where $R = p/q = m/(k + m)$.
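
A minimal sketch of these two calculations follows. It assumes the standard
parameterization in which $m$ is the mean, $p = m/k$, and $q = 1 + p$, so that
$R = p/q = m/(k + m)$; the factorials are evaluated with the log-gamma function
because $k$ is usually not a whole number. The function names and the mean and
variance used in the example are hypothetical.

```python
import math

def nb_k(mean, variance):
    """Moment estimate k = mean**2 / (variance - mean); requires variance > mean."""
    if variance <= mean:
        raise ValueError("moment estimate of k needs variance greater than mean")
    return mean ** 2 / (variance - mean)

def nb_pmf(x, mean, k):
    """P(x) = [(k+x-1)! / (x! (k-1)!)] * R**x / q**k for x = 0, 1, 2, ..."""
    # Log of the factorial ratio, valid for non-integer k via the gamma function.
    log_coeff = math.lgamma(k + x) - math.lgamma(x + 1) - math.lgamma(k)
    r = mean / (k + mean)      # R = p/q
    q = 1 + mean / k           # q = 1 + p, with p = mean/k (assumed parameterization)
    return math.exp(log_coeff + x * math.log(r) - k * math.log(q))

# Hypothetical survey: 0.8 sites per quadrat on average, variance 2.4.
k_hat = nb_k(0.8, 2.4)
p_empty = nb_pmf(0, 0.8, k_hat)                       # share of quadrats with no sites
total = sum(nb_pmf(x, 0.8, k_hat) for x in range(200))
```

Multiplying each probability by the number of quadrats gives the expected number of
units containing zero, one, two, or more sites.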





Nance (1983:334-335) has provided an example of how the negative binomial
distribution can be used to determine how many units should be surveyed. Using
sample estimates for $\bar{x}$ and $s^2$ from a simple random sample survey of 31
quadrats in the Upper Hat Creek region of British Columbia, Nance calculated the
parameters of the negative binomial distribution, $\bar{x}$ and $k$. For a given
quadrat size, then, he could predict the number of “empty” quadrats that would be
surveyed, the number containing one site, the number containing two sites, and so on.
He found that the negative binomial distribution fit the observed site distribution
very closely (Nance 1983:335, Tables 8.8 and 8.9). This fit was expected since the
predictions were being compared with the data from which they were derived, but
the results indicate the potential of this and other probability distribution functions
for indicating approximately how many empty units are likely to be found for a
given sample size. By extension, if a reasonable estimate of the probability that a
unit will not contain a site can be calculated, we can also determine the number of
units that would have to be surveyed in order to obtain a specific number of units
containing sites. For example, if the probability of any survey unit being empty is
0.50, then in order to obtain 30 units that contain sites we would need to survey
$n$ units, where

$$\begin{aligned}
30 &= n - (\text{number of empty units}) \\
30 &= n - (n)(\text{probability of an empty unit}) \\
30 &= n - (n)(0.50) \\
60 &= n
\end{aligned}$$
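
The same rearrangement, n = target / (1 - probability of an empty unit), can be
written as a small helper. The function name is hypothetical, and the result is an
expected value only: any particular survey of that size may yield more or fewer
units containing sites.

```python
import math

def units_to_survey(target_units_with_sites, p_empty):
    """Expected number of units to survey so that the target number contain sites."""
    if not 0 <= p_empty < 1:
        raise ValueError("p_empty must be at least 0 and less than 1")
    # Solve target = n - n * p_empty for n, rounding up to a whole unit.
    return math.ceil(target_units_with_sites / (1 - p_empty))

# The worked example from the text: 30 units with sites, half of all units empty.
n_required = units_to_survey(30, 0.50)
```

As the probability of an empty unit approaches 1, the required number of units grows
without bound, which is the empty unit problem in numerical form.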

Nance (1983:338) has pointed out that the magnitude of the empty unit
problem is likely to vary widely from region to region. The problem will be even
worse when interest focuses on specific site types as opposed to sites in general. At
this point we have two choices. We can either adopt very large sampling fractions or
try to reduce the spatial heterogeneity exhibited by most site distributions.


The primary means by which archaeologists have attempted to reduce
heterogeneity in site distributions is through stratification of the sample universe. Often
archaeologists divide a sample universe on the basis of criteria that they believe may 
have influenced site location or that they believe can serve as a proxy for such 
influence. Common criteria include soil type, vegetation zone, physiographic unit, 
or any combination of the above. In many instances the resulting areas are simply 
viewed as separate sample universes. For example, Thomas (1975:65) divided the 
Reese River region into three units on the basis of biotic communities, and the 
resulting subdivisions were viewed as separate sample universes. In order to draw a 
10 percent sample of the entire region, Thomas actually selected 10 percent of the 
sample units in each sampling domain by means of a separate simple random 
sampling procedure. 


The main advantage of this approach is that it ensures that all regions get
proportionally equal coverage. Further, because simple random sampling was
conducted in each region, parameter estimates can be computed for each stratum
using formulas designed for such sampling. If interest focuses on estimates for the
entire sampling universe (i.e., the areas encompassed by all strata combined),
however, then computing these estimates is somewhat more involved. For example,
to estimate the standard error of the sample mean derived from a simple
random sample, the following formula is used:

$$SE = \frac{s}{\sqrt{n}} \sqrt{1 - \frac{n}{N}}$$

where $SE$ is the standard error, $s$ is the standard deviation of the sample, $n$ is
the sample size, and $N$ is the size of the population. The standard error of the
sample mean derived from a stratified random sample is calculated as











$$SE_{strat} = \sqrt{\frac{\sum_h n_h s_h^2}{n^2}} \sqrt{1 - \frac{n}{N}}$$

where $SE_{strat}$ is the standard error of the stratified sample, $n_h$ is the number
of cases chosen from Stratum $h$, $s_h$ is the standard deviation in Stratum $h$, $n$
is the total number of cases chosen, and $N$ is the total number of cases in the
population.


The standard error is clearly easier to calculate for simple random samples than
for stratified random samples. The temptation is to make the assumption that the
variability within and between strata is approximately the same and thus proceed
with calculations as if the sample were a simple random one. The problem with this
approach is that each variable being measured may be characterized by different
levels of variability in the strata and different degrees of correlation with the criteria
used to create the strata (Dixon and Leach 1978:17). The net result for most
variables is that the standard error, as computed by the simple random sample
formula, is overestimated. This is true even if the same sampling fraction is used for
each stratum.


The stratification approach described above is best suited to relatively large
areas for which our information about site location is limited. Many Bureau of Land
Management Class II coal lease inventories in the Rocky Mountains fit this
description. These management-defined universes can cover well more than 100,000
acres and contain portions of several river basins. While at the outset archaeologists
may not be in a position to define strata that covary with site distributions, they may
be able to suggest that each major river basin could have encompassed a separate
settlement-subsistence system. Failure to divide the region into natural units could
lead to oversampling in some regions and undersampling in others and thus to
rather poor parameter estimates.


An alternative to this type of stratification is systematic sampling. In the latter
design, survey units are selected at set intervals, with the first unit usually being
chosen by a random process. Several experiments with archaeological data have
shown that systematic sampling can lead to relatively precise parameter estimates
(Judge et al. 1975; S. Plog 1976; Sanders et al. 1979). The main disadvantage of
systematic sampling is that the approach is liable to miss patterns in the underlying
distribution that exhibit periodicity. Statistically, a systematic sampling design is
somewhat more difficult to evaluate than a random design because bias can only be
estimated (Cochran 1977; Read 1975).








Discussions of sample stratification usually do not refer to definition of separate
universes. Generally, stratification means subdividing a sample universe into two or
more strata and then selecting different proportions of each stratum for
observation. When the population exhibits uneven spatial variability, as in the case
of clustered elements, such as sites, an areal stratification scheme that samples the
strata in proportion to their estimated variance will, if done correctly, lead to more
precise parameter estimates than simple random sampling, systematic sampling, or
stratified sampling with proportional allocation (Cochran 1977:99-103). Let us
assume, for example, that a region consists of two vegetation zones, 100 km² of
piñon-juniper forest and 100 km² of sagebrush. Further, let the population value for
site density in the piñon-juniper zone be four sites per square kilometer with a
variance of three, and the site density in the sagebrush zone be two sites per square
kilometer with a variance of 0.75. A 10 percent sample of the 200 km² region using
1 km² survey units would result in the survey of 20 units. Under a simple random
sampling approach, each unit selected has a 50-50 chance of being located in the
piñon-juniper forest and a 50-50 chance of being in the sagebrush zone. Using a
binomial distribution we can calculate the probability of selecting a specified
number of sample units in one of these zones as

$$P(r) = \binom{N}{r} p^r q^{N-r}$$

where $P(r)$ is the probability of selecting $r$ survey units, $N$ is the total number
of survey units selected, $p$ is the probability of selecting a survey unit in the zone
in question, and $q$ equals $1 - p$.


Table 6.2 lists the probabilities of selecting exactly 0, 1, 2, . . . 20 units in one of 
the vegetation zones. The most likely outcome is that of obtaining 10 survey units 
in each zone, which will occur approximately 17 percent of the time. The chances of 
obtaining distributions of 9-11, 8-12, or 7-13 are relatively good, with the 7-13 
distribution occurring about 15 percent of the time. While it is true that over many 
samples a relatively even split can be expected, for any one sample there is a fairly 
good chance that one zone will be overrepresented and the other underrepresented. 
Given the population values, a simple random sample will lead to rather 
imprecise estimates. That is, sample estimates of the population values are likely to 
fluctuate very widely and thus to be associated with large standard errors. 
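
The probabilities discussed above can be checked with a few lines of code. The following is a minimal sketch, assuming Python 3.8 or later; the function and variable names are ours and purely illustrative. It evaluates the binomial formula for N = 20 and p = 0.5, then sums the two ways a 7-13 split can arise (7 units in the zone of interest, or 13).

```python
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r units falling in the zone out of n drawn."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


N_UNITS = 20  # survey units drawn (10 percent of 200 one-square-kilometer units)
P_ZONE = 0.5  # chance that any one unit falls in a given zone

probabilities = [binomial_pmf(r, N_UNITS, P_ZONE) for r in range(N_UNITS + 1)]

for r, prob in enumerate(probabilities):
    print(f"{r:2d} units in the zone: {prob:.4f}")

# An even 10-10 split, and a 7-13 split counted in either direction
p_even = probabilities[10]
p_7_13 = probabilities[7] + probabilities[13]
print(f"10-10 split: {p_even:.3f}; 7-13 split (either way): {p_7_13:.3f}")
```

The 7-13 figure of about 15 percent quoted in the text corresponds to counting the split in both directions; a split favoring one named zone is about half as likely.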

TABLE 6.2. 
Probability of selecting a specified number of survey units in a particular zone under simple random 
sampling, p = 0.50 


Sampling each zone proportionally will not greatly affect this situation. In our 
example, if we were to treat each zone equally, exactly 10 units in each would be 
surveyed. For the sagebrush zone this might be sufficient, but given the large 
variance in the pinon-juniper zone such an approach would still lead to rather 
imprecise estimates. In this situation what we really want to do is to survey more 
units in the pinon-juniper zone than in the sagebrush zone. How many more? That 
depends on the variance of the sample mean and the cost of taking the sample. 
Cochran (1977:96) defines one function for computing cost as 


$$\text{cost} = C = c_0 + \sum_h c_h n_h$$

where $c_h$ is the cost per unit in Stratum $h$, $n_h$ is the number of units observed in 
Stratum $h$, and $c_0$ represents an overhead cost. In archaeology, costs per unit would 
include such items as recording time and travel time (often the latter is represented 
mathematically as 

$$\sum_h t_h \sqrt{n_h}$$

where $t_h$ is the travel cost per unit). The objective, then, is to minimize cost for a 
specified variance of the stratum’s sample mean or to minimize the variance of the 
sample mean for a specified cost. 
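
As an illustration of how such a cost function might be evaluated, the sketch below computes the linear cost and, optionally, adds a square-root travel term. The function name, argument names, and all numbers (person-days) are hypothetical and are not taken from Cochran or from any actual survey.

```python
from math import sqrt


def survey_cost(overhead, unit_costs, units_surveyed, travel_costs=None):
    """Return c0 + sum(c_h * n_h), plus sum(t_h * sqrt(n_h)) if travel costs are given."""
    cost = overhead + sum(c * n for c, n in zip(unit_costs, units_surveyed))
    if travel_costs is not None:
        cost += sum(t * sqrt(n) for t, n in zip(travel_costs, units_surveyed))
    return cost


# Hypothetical figures in person-days for two strata
total = survey_cost(overhead=10, unit_costs=[2.0, 1.0], units_surveyed=[16, 4])
total_with_travel = survey_cost(10, [2.0, 1.0], [16, 4], travel_costs=[0.5, 0.5])
print(total, total_with_travel)
```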


If one is not in a position to estimate cost and is willing to assume that cost per 
unit is the same in all strata, then determining optimum allocation reduces to the 
equation 

$$n_h = n \, \frac{N_h S_h}{\sum_h N_h S_h}$$

where $n_h$ equals the number of cases to be selected in Stratum $h$, $n$ refers to the total 
sample size, $N_h$ equals the number of potential cases in Stratum $h$, and $S_h$ is the 
variance of the sample mean in Stratum $h$ (Cochran 1977:98). This allocation is often 
referred to as Neyman’s allocation (Neyman 1934). For our previous example, using 
Neyman’s allocation we would obtain the following results: 


$$n_{\text{pinon-juniper}} = \frac{20(100 \times 3)}{(100 \times 3) + (100 \times 0.75)} = 16$$

$$n_{\text{sagebrush}} = \frac{20(100 \times 0.75)}{(100 \times 3) + (100 \times 0.75)} = 4$$
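
The allocation arithmetic can be reproduced with the short sketch below (function and variable names are ours). The first call enters the stratum variances directly, as the worked example does, and returns 16 and 4 units. Cochran's statement of Neyman allocation is usually written with the stratum standard deviation as the measure of spread, so the second call is included for comparison; under that reading the same 20 units would be split roughly 13 to 7.

```python
from math import sqrt


def neyman_allocation(total_n, stratum_sizes, stratum_spread):
    """Allocate total_n units across strata in proportion to N_h * S_h."""
    weights = [n_h * s_h for n_h, s_h in zip(stratum_sizes, stratum_spread)]
    total_weight = sum(weights)
    return [total_n * w / total_weight for w in weights]


sizes = [100, 100]       # potential 1 km² units: pinon-juniper, sagebrush
variances = [3.0, 0.75]  # population variances given in the example

# As in the worked example, which enters the variances directly
with_variances = neyman_allocation(20, sizes, variances)

# For comparison: the same formula with standard deviations as the spread measure
with_std_devs = neyman_allocation(20, sizes, [sqrt(v) for v in variances])

print(with_variances, with_std_devs)
```

Either way the qualitative conclusion is the same: the more variable pinon-juniper zone receives the larger share of the sample.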





The point here is that if sample size and sample fraction are relatively small, 
then use of prior knowledge about the nature of the phenomenon to be modeled 
may be the best way to obtain the precise estimates needed for modeling. The use of 
such information in archaeological modeling has been rather limited, perhaps 
because many archaeologists believe that they are not in a position to offer even 
good guesses as to the underlying population values. 


One approach to circumvent this problem is to perform a pilot study. For 
instance, if we were conducting a 10 percent sample survey of a national forest for 
the purposes of estimating site density, one strategy would be to select a 10 percent 
simple random sample of predefined areal grid units. This approach, while perhaps 
meeting the assumptions of sampling theory, many times leads to very poor results. 
Estimates are often not very precise, and one is left with the feeling that for all the 
rigor we have still not learned very much. A better approach might be to assume 
that site locations covary with certain mappable features (e.g., soils or landforms). 
This assumption could be tested by some type of probing or purposive survey (see 
next section) and/or by a relatively small, simple random sample survey. Based on 
the results of this survey, specific criteria could be defined that would lead to a 
useful stratification of the region. 


If a small simple random sample survey had been conducted, then the sample 
could be poststratified; that is, each of the units surveyed during the pilot simple 
random sample could be reclassified into one of the newly defined strata. Cochran 
(1977:134) notes that poststratification is almost as precise as proportional stratified 
sampling in providing parameter estimates as long as the samples in each stratum 
are reasonably large (say, more than 20) and the effects of errors in the stratum 
weights can be ignored. Basically, care must be taken to ensure that the final sample 
matches the population in important respects. If, for example, access to survey was 
denied on private land along river bottoms, the sampling frame might have 
excluded a high proportion of a certain site type. Simply giving added weight to the 
sites of that type that were included in the survey may not improve the sample's 
estimate for the density of that site type; indeed, it may make it worse (Dixon and 
Leach 1978:21). 
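
A poststratified estimate of mean site density is simply the stratum means weighted by each stratum's share of the region. The sketch below illustrates the calculation; the stratum names, weights, and pilot counts are invented for the example, and the function name is ours.

```python
def poststratified_mean(stratum_weights, observations):
    """Sum over strata of W_h times the sample mean of stratum h.

    stratum_weights: {stratum: share of the region in that stratum}, summing to 1
    observations: (stratum, sites_found_in_unit) pairs from the pilot survey
    """
    totals, counts = {}, {}
    for stratum, value in observations:
        totals[stratum] = totals.get(stratum, 0.0) + value
        counts[stratum] = counts.get(stratum, 0) + 1
    missing = set(stratum_weights) - set(counts)
    if missing:
        raise ValueError(f"no pilot units fell in strata: {sorted(missing)}")
    return sum(w * totals[s] / counts[s] for s, w in stratum_weights.items())


# Hypothetical pilot data: the floodplain is 25 percent of the region but
# happened to receive one-third of the pilot units.
weights = {"floodplain": 0.25, "upland": 0.75}
pilot = [("floodplain", 6), ("floodplain", 4),
         ("upland", 1), ("upland", 0), ("upland", 2), ("upland", 1)]

estimate = poststratified_mean(weights, pilot)
simple_mean = sum(value for _, value in pilot) / len(pilot)
print(estimate, simple_mean)
```

With only two to four units per stratum this toy example falls far short of the sample sizes Cochran recommends; it is meant only to show the mechanics. The function refuses to produce an estimate when a stratum received no pilot units, which is the situation described above for excluded private land.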


If we can justify poststratifying the sampling universe, then we can use the 
variance estimates for each stratum to determine the optimal allocation of cases for 
the second stratified random sample. Chances are extremely good that even though 
the parameter estimates of the stratified random sample would be based on a smaller 
number of cases (that is, assuming that the pilot study and the stratified random 
sample survey together covered 10 percent of the region) the gains made by 
stratifying the region would still lead to more precise estimates than those based on 
a single 10 percent simple random sample. 


As a final note, we want to point out one more serious problem with using data 
derived from stratified random sampling to develop predictive models. Generally, 
when a multivariate pattern-recognition model is developed some type of commercial 
software is used. All statistical software packages of which we are aware assume 
simple random samples; that is, the variance-covariance matrices are computed as 
if the data were obtained through a simple random sampling procedure. If stratified 
random sampling was used instead, then the matrices will be computed incorrectly. 
The statistical ramifications of this error are not well understood, although it is clear 
that the variances will be overestimated. Perhaps the best approach to this problem 
is to write a simple program to compute the matrices correctly and then use these 
matrices as input to the desired algorithm. 
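
One way such a program might look is sketched below. It is an assumption on our part, not a procedure specified in the text: each surveyed unit is weighted by N_h / n_h (the number of units in its stratum divided by the number sampled there), and the weighted mean vector and variance-covariance matrix are computed with the sum of the weights as divisor. No small-sample correction is applied, and all names and data are illustrative.

```python
def design_weights(strata, stratum_sizes):
    """Weight N_h / n_h for each observation, given its stratum label."""
    counts = {}
    for label in strata:
        counts[label] = counts.get(label, 0) + 1
    return [stratum_sizes[label] / counts[label] for label in strata]


def weighted_covariance(rows, weights):
    """Weighted mean vector and variance-covariance matrix of the rows."""
    total_w = sum(weights)
    k = len(rows[0])
    means = [sum(w * row[j] for row, w in zip(rows, weights)) / total_w
             for j in range(k)]
    cov = [[sum(w * (row[i] - means[i]) * (row[j] - means[j])
                for row, w in zip(rows, weights)) / total_w
            for j in range(k)]
           for i in range(k)]
    return means, cov


# Hypothetical data: two variables measured on three surveyed units.
# Stratum A (100 units) was sampled twice, stratum B (100 units) once.
strata = ["A", "A", "B"]
rows = [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0]]
weights = design_weights(strata, {"A": 100, "B": 100})
means, cov = weighted_covariance(rows, weights)
print(weights, means, cov)
```

The resulting matrix could then be supplied to whatever multivariate routine accepts a covariance matrix as input, in place of the one the package would compute under the simple random sampling assumption.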


Purposive Selection 


One of the main objectives of collecting new data for predictive modeling is to 
make certain that no magnet site is missed. Many such sites will have been recorded 
prior to the survey or will at least be known to local informants. In this situation all 
that need be done is to verify the site’s location and the nature of its surface 
assemblage and environmental context. In cases where there is reason to believe 
that not all magnet sites are known, survey strategies that maximize the chances of 
finding sites in this category must be designed. There are two options. When there 
is evidence that important centers were distributed according to some predictable 
feature of the natural landscape, such as at a regular interval along a major river or at 
the confluence of major watercourses, specific areas can be picked to survey. A 
second approach, which is especially useful in regions occupied by complex societies, 
is to use some type of remote sensing information. Because regional centers 
tend to be the largest and most complex sites in a given area, they can often be 
detected on aerial photographs (see Chapter 9). Another technique that enables the 
archaeologist to cover extensive ground areas in short periods of time is aerial 
survey from a small-engine aircraft or a helicopter. In this regard, Rogge and 
Lincoln’s comments concerning the Tucson Aqueduct surveys (described in Chapter 3) 
are particularly appropriate. 


Our Tucson Aqueduct case indicates that we did learn a great deal with each new survey 
but implies that our predictive models were not particularly robust. Neither did 
conducting the surveys exactly “by the book” ensure meaningful input into our 
planning process. If we were to start the Tucson Aqueduct series of surveys over 
today with the 20-20 vision of hindsight, we might decide to spend a few days with a 
helicopter looking for platform mounds and do nothing more until a route was selected 
[1984:19]. 


The two approaches are not mutually exclusive. Indeed, in one of the most 
intensive archaeological surface surveys ever conducted, Millon (1972:11-12) had 
the entire confines of the city of Teotihuacan photogrammetrically mapped to 
reveal low-lying mounds, which are often the remains of architectural features. The 
maps were then used to guide subsequent fieldwork. 


Research designs can incorporate both purposive selection and probability 
sampling. During the Tucson Aqueduct surveys, for example, had the Bureau of 
Reclamation conducted a helicopter survey and found the three platform mounds, a 
stratified random sample survey could have been conducted. Three sampling strata 
consisting of arbitrary 10 by 10 km grids centered over each platform mound and a 
fourth stratum representing the remainder of the survey universe could have been 
defined, with the surveyors covering relatively high sampling fractions in each 
Hohokam community stratum and a much lower fraction of the remaining region. 


Finally, it is important to point out that even with the best of sample survey 
designs, magnet sites will still be missed. Some have argued that this is exactly why 
sample surveys and predictive models should not be used. In an absolute sense, 
these critics are right; present models make more mistakes (especially gross errors) 
than anyone is willing to accept. But blind 100-percent surveys are not necessarily 
the answer. Complete inventory surveys that have no theoretical foundation often 
end up adding little to our understanding of prehistory. Further, depending on the 
field methods (i.e., crew spacing, recording technique, etc.), “100-percent” surveys 
can easily miss all types of sites, including magnet sites. In short, regardless of 
whether the sampling fraction is 10 percent or 100 percent, there is no substitute for 
a well-thought-out survey design that is grounded in some theoretical foundation. 


Depositional and Postdepositional Processes 


The final class of data needed for the creation of a predictive model concerns 
the processes affecting site detection and site survivability. From a research 
perspective, it is important to be able to predict areas where sites probably were 
located but where evidence of past activities has been destroyed by natural 
processes. While negative evidence may not be very helpful in substantiating 
hypotheses about settlement location, proper geomorphic interpretation may be 
critical if we are to avoid incorrect rejection of a hypothesis because of the lack of 
cultural remains. From a management perspective, it may be less critical to model 
site destruction that has resulted from natural processes, but it is still necessary to 
model locations of sites that are intact but not visible on the surface. Buried sites are 
perhaps the land manager’s worst nightmare. Often they are not found in the course 
of usual cultural resource studies and are only detected after construction or 
development has begun. The mitigation of adverse effects on buried sites often 
ends up costing much more than the expenses of the archaeology alone. 


To find buried sites the first step is to detect and trace paleo land surfaces 
suitable for habitation. This task properly falls into the field of geomorphology. 
While archaeologists have worked with geomorphologists for many years (e.g., 
Butzer 1971, 1982; Davidson and Shackley 1976; Hassan 1979; Haynes 1968; Haynes 
and Agogino 1966; Jacobsen and Adams 1958; Martin and Klein 1984; Saucier 1974), 
this working relationship by and large has not been transferred into the area of 
predictive modeling. Geomorphic fieldwork should ideally precede at least one 
stage of archaeological fieldwork. The results of the geomorphic analyses are often 
presented as maps of paleo land surfaces that specify areas where buried sites are 
likely to be found. If such studies were carried out in conjunction with archaeological 
surveys, areas designated by the geomorphologist could be examined with 
subsurface tests. 


The issue of subsurface testing on surveys has recently received considerable 
attention (e.g., Krakker et al. 1983; Lightfoot 1986; McManamon 1984; Nance and 
Ball 1986; Wobst 1983). Most of this interest stems from research in forested areas 
where the ground surface is obscured. In these situations visual inspection of the 
surface greatly underestimates the numbers of sites and leads to highly skewed 
locational patterns. If these factors are not taken into account, then statements 
about settlement patterns that implicitly assume that the observed sites are 
representative of site locations in general are likely to be highly inaccurate. 


Thus far most research on discovering buried sites has focused on sites that lie 
on or near the surface. The approach that has gained widespread acceptance in this 
situation is to space small subsurface probes, usually in the form of shovel pits or test 
pits, at a set interval along a survey transect. Nance and Ball (1986) have shown that 
the likelihood of discovering sites with subsurface tests varies directly with the 
artifact density and the size of the sites. Another key variable in determining site 
discovery potential is the intensity with which the fill of the test is inspected. The 
probability of site discovery increases dramatically with a shift from visual inspection 
of the fill to screening of the fill, and the probability increases still further as the 
size of the screen mesh decreases. It is worth pointing out, however, that even with 
a small interval between subsurface tests and screening through fine mesh, the 
likelihood of missing small, low-density sites is usually very high. 
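
The direction of these relationships can be illustrated with a deliberately simple model. The sketch below is not Nance and Ball's formulation; it rests on our own assumptions that probes lie on a square grid, that artifacts are scattered at random within a site (Poisson), and that the inspection method recovers a fixed share of the artifacts in the fill. All names and figures are hypothetical.

```python
from math import exp


def discovery_probability(site_area, artifact_density, test_area, interval,
                          recovery_rate=1.0):
    """Rough chance that at least one probe on a square grid detects a site.

    site_area: m²; artifact_density: artifacts per m²; test_area: m² per probe;
    interval: grid spacing in m; recovery_rate: share of artifacts in the fill
    that the inspection method (visual, coarse screen, fine screen) recovers.
    """
    # Expected number of probes that fall inside the site
    tests_in_site = site_area / interval**2
    # Poisson assumption: chance a probe yields at least one recovered artifact
    p_positive = 1 - exp(-artifact_density * recovery_rate * test_area)
    return 1 - (1 - p_positive) ** tests_in_site


# 50 x 50 cm probes at a 20 m interval
small_sparse = discovery_probability(100, 0.5, 0.25, 20)
large_dense = discovery_probability(2500, 5.0, 0.25, 20)
visual_only = discovery_probability(2500, 0.5, 0.25, 20, recovery_rate=0.3)
screened = discovery_probability(2500, 0.5, 0.25, 20, recovery_rate=1.0)
print(small_sparse, large_dense, visual_only, screened)
```

Under these assumptions the small, sparse site is found only a few percent of the time even though the large, dense site is almost certain to be found, which mirrors the caution in the text.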


The problem of buried sites is not confined to forested areas or regions where 
the ground surface is obscured. Geomorphic changes can lead to buried sites in areas 
with good surface visibility. For example, in some desert areas of the American 
Southwest, remains of the Hohokam culture (ca. AD 200-1450) can be found on the 
surface. Pedestrian surface survey results usually correlate fairly well with intact 
subsurface deposits of this age. Remains of the preceding Archaic and Paleoindian 
periods, however, are not generally found on the surface. Sites associated with these 
periods tend to be found in deep erosional cuts or as the result of modern land 
disturbance or construction. Thus, an interpretation of negative results of pedestrian 
surveys in these regions as meaning that no Archaic or Paleoindian sites lie in 
the survey area involves an inaccurate and unjustifiable logical leap from the surface 
to the subsurface. 


The problem of buried sites is fairly widespread and will always have to be 
taken into account when designing surveys to build predictive models. One 
approach is to use the results of a geomorphic analysis as a means of stratifying the 
area. The paleo land surfaces identified could each be assigned a relative probability 
of site discovery. This probability could be based on previous research, the types of 
depositional environments represented, or a combination of these factors. Each 
stratum could then be divided into grid units and a number of grid units selected for 
survey through a random process. Optimum allocation of the number of grid units 
selected in each stratum could be based on the relative probabilities previously 
defined. Each grid unit could then be subdivided into smaller units, with a set 
number of these units being selected for subsurface tests through either a random 
or a systematic process. 
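
The selection procedure just described can be sketched as follows. This is a minimal illustration, assuming the random option at both stages; the stratum names, the numbers of grid units, and the allocation are hypothetical, and the function name is ours.

```python
import random


def two_stage_sample(strata, units_per_stratum, subunits_per_unit, n_tests, seed=0):
    """Pick grid units at random within each stratum, then subunits within each unit.

    strata: {stratum name: number of grid units in the stratum}
    units_per_stratum: {stratum name: number of grid units to survey}
    subunits_per_unit: number of subunits each grid unit is divided into
    n_tests: number of subunits to test in each selected grid unit
    """
    rng = random.Random(seed)  # fixed seed so the design can be reproduced
    design = {}
    for name, n_units in strata.items():
        # First stage: grid units, drawn without replacement
        chosen_units = sorted(rng.sample(range(n_units), units_per_stratum[name]))
        # Second stage: subunits for subsurface tests within each chosen unit
        design[name] = {
            unit: sorted(rng.sample(range(subunits_per_unit), n_tests))
            for unit in chosen_units
        }
    return design


# Hypothetical paleo land surfaces, with more units allotted to the strata
# judged more likely to hold buried sites
strata = {"terrace": 40, "floodplain": 60, "upland": 100}
allocation = {"terrace": 8, "floodplain": 8, "upland": 4}
design = two_stage_sample(strata, allocation, subunits_per_unit=16, n_tests=4)

for name, units in design.items():
    print(name, units)
```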


The sampling scheme described above is referred to by statisticians as two-stage 
sampling or subsampling (Cochran 1977). Although the statistics can become rather 
involved, this type of survey design can lead to unbiased and precise parameter 
estimates. Parenthetically, if the design is extended to sampling the subunits 
themselves, then it is referred to as three-stage sampling or multistage sampling. The 
latter term is often misused by archaeologists to refer to sampling designs that are 
carried out in sequential steps (e.g., conduct a 1 percent simple random sample 
survey of a region [step 1], stratify the region [step 2], conduct a 10 percent 
stratified random sample survey of the region [step 3], and so on). Although the 
term multistage seems entrenched in the archaeological literature, to avoid confusion 
with other uses of the term we refer to this type of sampling design as multistep 
throughout this volume. 


DATA COLLECTION IN CRM CONTEXTS 


The preceding discussion of survey strategies was presented as though we 
already knew where we were going to survey, how we were going to survey, and 
how we were going to record data. In practice, these are three of the most important 
factors that are involved in designing a survey. In an ideal setting, all three are 
determined on the basis of research objectives. But surveys conducted in cultural 
resource management contexts are subject to a set of unique constraints that often 
greatly restrict the ways that these three factors can be integrated into the overall 
survey design. In the following discussion we examine the ways in which management 
needs have shaped survey design and evaluate the common responses to these 
needs in terms of their usefulness for model building. These issues will be discussed 
under three specific topics: survey universe, survey intensity, and data recording. 


Survey Universe 


Ideally, the selection of an area to survey or from which sample units are to be 
selected is based on theoretical propositions underlying the research design or 
topic. In theory, researchers want to select a survey universe that conforms to a 
cultural unit. In practice, however, at best we can only approximate this situation. 
Cultural systems rarely have sharp boundaries. Defining where one system ends 
and another begins is usually impossible for ethnographers, to say nothing of the 
problem faced by archaeologists. Further, cultural systems change through time in 
nature and in size. Thus, a survey universe suitable for studying one culture may be 
too large or too small for examining its predecessors and its successors. 


A common solution to this dilemma is to select a region that conforms to a 
natural unit, such as a drainage basin or an island, with the size and type of the 
natural unit selected depending on the research topic. At one extreme, Sanders 
chose the entire Basin of Mexico as the survey universe for a study of the origin of 
state-level societies in highland Mexico. The ensuing project lasted 15 years and 
involved about 50 field months of actual survey (Sanders et al. 1979:19). Most 
projects are not nearly as large as the Basin of Mexico survey, but in all cases the 
selection of a survey universe is a compromise between two opposing criteria. On 
the one hand, we want a region that is large enough so that we can reasonably argue 
either that the remains of the prehistoric settlement systems that characterized the 
area are contained within the survey universe or that all major components of those 
systems are at least represented. On the other hand, we want the survey universe to 
be as small as possible, thereby allowing us to maximize our survey effort. 


For surveys conducted in a cultural resource management context, the survey 
universe is most often defined not by archaeologists but by land managers, who 
must take into account many factors that have little to do with archaeological 
research. In some cases the survey universe will encompass one or more natural 
units, but usually it will not. It is appropriate, therefore, to consider the implications 
that management-defined survey universes have on building predictive 
models. 


To illustrate these issues we will use the example of a Bureau of Land 
Management cultural resource management project in the San Rafael Swell region 
of east-central Utah that was mentioned earlier. The San Rafael Swell is an 
elongated anticline approximately 110 km (50 mi) long and 530 km (23 mi) wide. Since 
1979 the Bureau of Land Management has sponsored six major survey projects in the 
region (Table 6.3). All of these projects were designed as probabilistic sample 
surveys of management-defined survey universes encompassing different portions 
of the swell. Of the more than 550,000 ha (ca. 1,360,000 acres) comprised by the San 
Rafael Swell, more than 10,000 ha (ca. 25,000 acres; 1.8 percent of the total area) were 
inventoried as part of these seven projects. 


The modeling efforts carried out in conjunction with these survey projects 
mirror the general trends in predictive modeling. The first locational analyses 
consisted of univariate and bivariate correlations between site location and specific 
environmental variables (Hauck 1979a, 1979b). Significant associations between site 
locations and environment were then combined into overlay models (Thomas et al. 
1981). Finally, most recent attempts use sophisticated multivariate discriminant 
function analysis and hierarchical clustering models (Tipps 1984). 


TABLE 6.3. 
Estimated density of prehistoric sites in project areas on and near the San Rafael Swell 


None of these models has been a very good predictor. This is not a result of 
glaring errors in the derivation of the samples or the application of the statistics. 
Instead, the poor accuracy rate appears to result from “tunnel vision.” Each model 
was based on an inductive, pattern-recognition approach that viewed the survey 
universe as the only region of interest. Even a casual glance at Table 6.3, however, 
indicates that wide fluctuations in mean site density exist. This is probably also true 
of the sample variances, but except for a few cases, these are not published. These 
variations are probably caused by settlement and subsistence practices that are 
regional in scope. Thomas and his colleagues appear to recognize this situation and 
state that 


The Central Coal II Class II inventory was designed as a 10 percent simple random 
sample of three sampling universes (Study Tracts I, II, and Area III). The notion 
underlying this type of approach is that the survey results of the sampled portion of each 
tract can be generalized over the entire tract. This method may be useful for evaluating 
site sensitivity in Tracts I and II. But Area III does not appear to be a self-contained 
cultural unit. Settlement in this area seems to be directly related to practices in the 
adjoining regions. . . . Trying to generalize the results of the sampled portion of Area III to 
the entire tract is apt to be misleading. Many of the critical features in the settlement 
system clearly were not included in the sampling universe. Instead of developing a 
statistically valid model that makes little logical sense, it seems far preferable to create an 
internally consistent model of the settlement system that can then be used to evaluate 
the Area III portion of the system and thus to predict areas of site sensitivity [Thomas et 
al. 1981:199]. 


Thus, while a 10 percent simple random sample of Central Coal II, Area III might 
yield a representative sample of spatial units for that area, it is quite possible that 
regional patterns in the settlement system would go undetected on the basis of this 
sample. Even though the parameter estimates might be reliable, predictive models 
based only on patterns discernible within the sample universe are, in this case, 
likely to yield disappointing results. 


There is no easy solution to this problem. To build useful predictive models 
the researcher must have reason to believe that the survey universe conforms to a 
cultural unit, or failing this, he or she must use a defendable proxy, such as a natural 
unit. If we must use management-defined survey universes, then it is critical that 
the fit between an appropriate cultural or natural unit and the arbitrary universe be 
assessed. In addition, the resulting model must take into account the position of the 
resources in the survey area relative to the larger settlement-subsistence system, 
and it must incorporate regional factors affecting settlement location. 


Designing a research strategy to accomplish this task may involve some 
restructuring of many cultural resource management programs. To use the BLM as 
an example, one solution would be to subdivide each district into natural units. 
Instead of building a model for each major coal lease project, archaeologists might 
build a predictive model for each individual natural unit within the district, with 
the model being periodically refined as more data become available. Thus, the data 
from a large coal lease project might be better used in several regional predictive 
models instead of one ad hoc, project-specific model. Such a program would require 
that each project carried out within the district emphasize the collection of comparable 
data. While it is possible to combine probabilistic samples, this requires 
considerable statistical expertise. A much more serious problem is that of ensuring 
that the entire sampling universe is adequately covered. The usual government 
policy is not to survey privately owned land. In many areas private property covers 
much of the “desired” land, such as the bottomlands in a river valley or the 
elevated, well-drained soils in a deltaic plain. Many sites, including a high proportion 
of magnet sites, will often be found on private land. If we eliminate such areas 
from our sampling universe, our ability to predict site location will be greatly 
hindered. 


Certainly there are many problems involved in designing cultural resource 
management projects that focus on culturally meaningful study areas, but projects 
that emphasize development of ad hoc models for arbitrary units are clearly as 
responsible for the poor showing of these models as anything else. Unless this focus 
changes so that the models can be related to cultural phenomena, it is unlikely that 
the results will improve. 


Survey Intensity 


Without doubt the single most important factor affecting the number of sites 
located on a survey is the effort made to find them. Survey intensity can be 
measured in terms of the ratio of person-days to square miles surveyed or on the 
basis of the spacing between surveyors (Judge 1981; S. Plog et al. 1978; Schiffer and 
Wells 1982). Regardless of the measure used, all studies to date confirm Judge’s 
(1981:128) statement that “the more time spent in the field looking for sites, the 
more sites will be found.” 


S. Plog et al. (1978:391-393) examined the relationship between survey intensity 
(as measured by person-days per square mile surveyed) and estimates of site 
density using the results of 12 surveys conducted in the southwestern United 
States. They found a strong positive linear correlation between the two variables, 
which is to say that as survey intensity increased so did site density. Part of this 
relationship is a result of the time spent in recording sites and making collections 
once the sites are found. Thus, we would expect that as more sites are found more 
time must be spent in the field. S. Plog et al. (1978:393) argue convincingly, 
however, that this is not the whole story, that indeed, if one controls for extra time 
spent recording sites, a strong positive relationship still exists between survey 
intensity and site density. In theory, a point of diminishing returns should be 
reached beyond which increases in intensity do not result in proportional increases 
in site density estimates; for the 12 surveys studied, however, no evidence was 
found that indicated that such a point had been reached (S. Plog et al. 1978:393; see 
also discussion in Chapter 4 of this volume). 
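
For readers who wish to examine the same kind of relationship in their own survey data, a linear correlation can be computed as in the sketch below. The six pairs of values are invented for illustration and are not the figures analyzed by S. Plog et al.; the function name is ours.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical surveys: intensity in person-days per square mile,
# density in sites per square mile
intensity = [2, 5, 8, 12, 20, 30]
density = [1.0, 2.5, 3.0, 6.0, 9.0, 14.0]

r = pearson_r(intensity, density)
print(f"r = {r:.3f}")
```

A high coefficient on its own does not separate search effort from recording time; that requires controlling for the time spent recording sites, as the authors cited above did.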


Selection of an appropriate level of survey intensity requires careful consideration 
of several factors. The major consideration affecting survey intensity should be 
the research objectives. Is it necessary to locate “all” resources, or are we primarily 
interested in specific types of sites? For example, the Basin of Mexico survey project 
discussed earlier was designed to recover “a variety of data on where people had 
lived during the pre-Hispanic past in the survey area” (Sanders et al. 1979:15). The 
surveys, therefore, focused on habitation sites and made no attempt to identify 
more ephemeral, limited-activity loci. The selection of a flexible survey interval of 
between 15 and 17 m (Sanders et al. 1979:24) was appropriate to these objectives. 


In cultural resource management contexts, surveys are rarely focused on a 
particular type of site, and even surveys designed to acquire data for a specific set of 
research objectives are uncommon. Usually the stated goal is to find “all” the 
resources. Such a hubric ideal can never be achieved, however, and what is really 
meant by “all” is some very high proportion of the recoverable resources. 


Selection of an adequate survey intensity also depends on the nature of the 
resource base and the prevailing natural conditions. As discussed in Chapter 4, the 
latter directly influence our ability to detect archaeological material, a factor 
categorized by Schiffer and others as visibility (Schiffer and Gumerman 1977:186-187; 
Schiffer and Wells 1982:349; Schiffer et al. 1978:6). In general, high visibility means 
that if cultural remains exist on the surface an observer should be able to see them. 
High-visibility areas generally have sparse vegetation, e.g., deserts, beaches, or 
plowed fields. Low-visibility areas have masked or obscured surfaces. Pedestrian 
surface survey techniques yield poor results in these areas and must be supple- 
mented by subsurface investigations, such as shovel tests or test pits, or by 
techniques that expose the surface, like raking or plowing. 


Cultural factors affecting the likelihood of site detection include site size, site 
obtrusiveness, site distribution, and surface artifact density. In general, larger sites 
have a better chance of being found than smaller ones; sites with high surface-artifact 
densities are more likely to be seen than those with sparse or no surface 
expression; and sites with obtrusive features, such as mounds or masonry, are easier 
to find than sites lacking such features. While these generalizations may seem to be 
self-evident, they have important implications for the model-building process. 
Previously it was argued that in order to construct a successful predictive model we 
need (a) to have reliable estimates of a number of parameters associated with site 
location, (b) to locate all or most of the magnet sites, and (c) to assess the effects of 
depositional and postdepositional processes on site visibility. To discover magnet 
sites, large areas must be covered, but often these areas can be surveyed at very low 
intensities without affecting the result. In contrast, accurate parameter estimation 
for less-obtrusive sites requires a much higher level of effort per area surveyed. 
Given these competing requirements, several archaeologists have recently advocated 
multistep survey designs in which different types of data are acquired at 
different stages (e.g., Doelle 1976, 1977; Schiffer and Wells 1982; Schiffer et al. 1978). 


Since survey intensity directly affects the rate of site discovery, one would 
think that this issue would weigh heavily in the evaluation of proposed survey 
strategies. In practice, this is rarely the case. Many scopes of work specify the 
interval between surveyors; the types of subsurface tests, if any, that will be 
conducted; and the information that is to be recorded on each site. The rationale for 
providing these fixed specifications appears to be that this will ensure that all 
contractors bid on the same work. While the objective is understandable, it is 
important that the land-managing agencies realize the effects of this decision on the 
model-building process. When these aspects of the survey methodology are 
prespecified, survey intensity becomes a parameter rather than a variable. Thus, what 
is probably the single most important factor affecting the power of any predictive 
model is being arbitrarily set by the managing agencies for reasons that have little to 
do with archaeology. 


The point is that selection of the survey intensity is a critical and integral step 
in the model-building process. The choice should be based on fieldwork and subject 
to testing and refinement, as well as to changes when the research objectives 
change. One contribution that the managing agencies can make to the accuracy rate 
of predictive models is to allow survey intensity to be set on the basis of archaeologi- 
cal considerations rather than procurement procedures. 


Data Recording 


In the preceding section we discussed some of the factors affecting the number 
and types of sites discovered. Yet we side-stepped perhaps the most important 
issue: what is a site? To a large extent, site definition is actually an issue of data 
recording. That is, we need to define consistent and replicable criteria by which 
space can be partitioned into those areas that we want to call sites and those that we 
do not. Traditionally, this issue has not been problematic. Archaeologists tended to 
focus on large sites with discrete boundaries, such as masonry pueblos or earthen 
mounds. In the last decade, however, some researchers have focused on loci where 
evidence of cultural activity is more ephemeral, such as isolated finds or low-density 
artifact scatters, and it has become clear that these phenomena can be quite 
important to our understanding of the prehistory of a region (e.g., Doelle 1976, 1977; 
Goodyear 1975; Teague and Crown 1983; Thomas 1975). 


This awareness of the continuous aspects of the archaeological record has led a 
number of archaeologists to question the utility of the site concept (Dunnell and 
Dancey 1983; Ebert et al. 1984; Thomas 1975; see also the discussion by Ebert and 
Kohler in Chapter 4). These investigators have rightly pointed out that sites do not 
behave; rather, people behave, and these behaviors have a spatial dimension that in 
no way correlates with discrete boundaries on a one-to-one basis. The problem of 
site definition is directly analogous to the “community boundary” issue, which has 
been extensively debated in social anthropology for the past 50 years (Bell and 
Newby 1974; Galeski 1972; Goodenough 1966; Leach 1961). The central point of this 
issue is that anthropologists have taken the village or settlement as their unit of 
analysis even though they recognize that people living in a village may work outside 
the village, may own land outside the village, may travel outside the village, and 
may have relationships with people living in other villages. The basic question, 
then, is at what point does the researcher set boundaries for the analysis and treat 
the resulting unit as an object of scientific inquiry? While no absolute answer has 
emerged, most anthropologists have used a spatial aggregate (whether it be a 
village, town, or city block) as the unit of analysis. They argue (at least implicitly) 
that the people within this unit are more similar to each other than they are to 
people living outside the unit and/or that they have more relationships with each 
other than they do with outsiders. In archaeology, Chang (1967, 1968) has put 
forward similar arguments in favor of using the settlement, defined as a single 
component, as the unit of analysis. As many critics of Chang’s approach have 
pointed out (Binford 1968; Clarke 1968:648), however, components can only be 
defined after the assemblage has been analyzed. 


While it may be obvious at the time of a survey that a mile-long lithic scatter 
represents multiple occupations, one still has to deal with the problem of recording 
it. Should an attempt be made to define discrete loci as separate sites, or should the 
entire area be labeled one site? Further, if the artifact scatter extends beyond the 
survey unit, should the entire scatter be recorded or only the portion within the 
unit? On the last point most archaeologists would agree that if part of a site is 
located in a sample unit, the entire site should be recorded. In practice, however, 
there are instances, such as coastal shell middens and lithic quarry sites, that can 
easily extend into two or more sample units and for which any boundary is 
somewhat arbitrary. 


There is no easy way to answer these questions in the abstract. Many agencies 
and institutions have tried to resolve them by adopting arbitrary criteria, such as a 
minimum of five flakes per 5 m², for site definition. This practice is not without its 
problems, and it has important implications for model building. For example, 
consider two areas, one in which five flakes were found in a 5 m² area and another in 
which four flakes were discovered in an area of the same size. Under the arbitrary 
definition given above, the first area would be recorded as a site and the second as 
containing four isolated finds. During the development of a predictive model, 
isolated finds are usually either ignored or given the same weight as sites. For the 
example above, this would result in a model that would either incorporate five sites, 
four of which are in exactly the same environment, or one site, with the area 
containing the four isolated finds being considered a nonsite. Does this make sense 
in terms of human behavior? Most likely it does not. 
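

The effect of such a cutoff on model input can be illustrated with a small 
sketch. It is only an illustration: the function names, the isolates_as_sites 
switch, and the default threshold of five flakes are assumptions made here and 
do not represent any agency's recording standard. 

```python
def classify_area(flake_count, min_flakes=5):
    """Label an equal-sized collection area under an arbitrary density rule."""
    return "site" if flake_count >= min_flakes else "isolated finds"


def model_inputs(flake_counts, min_flakes=5, isolates_as_sites=False):
    """Count the 'site' observations that would be passed to a model.

    isolates_as_sites=False ignores isolated finds altogether;
    isolates_as_sites=True gives every isolated find the weight of a site.
    """
    n_sites = 0
    for count in flake_counts:
        if classify_area(count, min_flakes) == "site":
            n_sites += 1
        elif isolates_as_sites:
            n_sites += count
    return n_sites


# The two areas from the example above: five flakes and four flakes.
areas = [5, 4]
print(model_inputs(areas, isolates_as_sites=False))  # 1
print(model_inputs(areas, isolates_as_sites=True))   # 5
```

A difference of a single flake changes the model input from one observation to 
five, which is the behavioral implausibility noted above. 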


The decisions as to what will be designated as a site and how that phenomenon 
will be recorded must therefore be based on the issues being addressed. In the case 
of the Basin of Mexico survey discussed above, interest focused on the development 
of complex societies, and the survey crews concentrated on finding habitation sites 
(Sanders et al. 1979). In contrast, in the Reese River survey Thomas (1975) was 
interested in settlement and subsistence patterns of Great Basin hunters-and- 
gatherers, and the basic unit of analysis shifted from the site to the artifact. 


While the definition of a site, or more precisely of the unit of analysis, must 
necessarily be related to the research question being addressed, it is also critical that 
resources be recorded in a replicable and consistent fashion. Ideally, we should be 
able to record resources in a way that is independent of how a “site” is defined. For 
many state and federal agencies the site itself is little more than a bookkeeping 
device for maintaining accurate records. For these purposes an arbitrary definition 
will suffice. The problem, then, is to find a way to fill out site records for agencies 
using one definition, while retaining the capability to manipulate the data according 
to any of a number of other definitions. 


One approach to this problem is to view archaeological data as a series of 
hierarchically arranged dimensions. The scale at which data are collected will 
determine in what ways they can be used in subsequent analyses. Data collected at 
more specific levels can usually be aggregated to express information at a higher 
level, but the reverse is not true. For example, data on artifacts can be grouped to 
provide characteristics of features or sites (such as counts of different artifact types), 
but information collected at the site level cannot be used to derive information 
about artifacts or features found within sites. 


In view of the ongoing debate about the desirability of conducting “siteless” 
archaeology (Dunnell and Dancey 1983; Ebert et al. 1984; Chapter 4 of this volume) 
within the context of predictive modeling, it may be worthwhile to explore the 
possibility of collecting field data in several hierarchical levels, with the data being 
organized in such a way that relationships between levels are easily recoverable. 
That is, data could be collected at the levels of (a) the survey units, (b) the sites 
(however one might choose to define them), (c) the different activity areas or 
features within sites, and (d) individual artifacts, whether from particular features or 
as isolated entities. Identification of the survey unit in which sites are found, the site 
in which features occur, and the features with which artifacts are associated (use of 
pointers to different levels in the hierarchy) would permit data from more specific 
levels to be aggregated or combined in order to provide variables containing 
information about the next higher level in the hierarchy. Durand and Davis (1985) 
have recently reported a similar scheme, which they designed to manage 
archaeological resources in Nevada. Other states, such as Hawaii, also have 
similar data base systems. 





Table 6.4 presents a hypothetical example of this approach. It depicts a 
four-level hierarchical design extending from the survey unit (the highest level) to 
the artifact (the lowest level). In Table 6.4 the relationships between different levels 
in the hierarchy are maintained by labels on each successive record that identify, in 
turn, the survey unit, the site, the feature, and the artifacts (which are either 
isolated finds or parts of features). A particular level in the hierarchy can take on a 
null value in order to accommodate features that are not associated with sites in the 
traditional sense, as well as isolated artifacts. In practice, the number of hierarchical 
levels, their labels, and other details of implementation would be the responsibility 
of the system designer/user. 
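

A labeled record of this kind might be sketched as follows. The sketch does not 
reproduce Table 6.4; the field names, the identifiers (U1, S1, F1, A1, and so 
on), and the two helper functions are hypothetical and are used only to show how 
null values and level labels could work together. 

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One row of a four-level hierarchy; None marks a null level."""
    survey_unit: str
    site: Optional[str]
    feature: Optional[str]
    artifact: Optional[str]


records = [
    Record("U1", "S1", "F1", "A1"),  # artifact in a feature on a site
    Record("U1", "S1", "F1", "A2"),
    Record("U1", None, "F2", None),  # feature not associated with any site
    Record("U2", None, None, "A3"),  # isolated artifact
]


def isolated_artifacts(recs):
    """Artifacts whose site and feature labels are both null."""
    return [r.artifact for r in recs
            if r.artifact and r.site is None and r.feature is None]


def artifacts_in(recs, level, label):
    """Artifacts whose label at the given level matches, e.g. ('site', 'S1')."""
    return [r.artifact for r in recs
            if r.artifact and getattr(r, level) == label]


print(isolated_artifacts(records))                 # ['A3']
print(artifacts_in(records, "survey_unit", "U1"))  # ['A1', 'A2']
```

Because every record carries the labels of all higher levels, the same 
artifacts can be retrieved by survey unit, by site, or by feature without 
committing in advance to one definition of a site. 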


This methodology would make it possible to deal with cultural remains 
occurring either in packets termed sites or as individual items varying in density 
across the landscape. In the latter instance, if the spatial coordinates of artifacts have 
been recorded, their locations could be entered into density-contouring algorithms 
or k-means analysis (Kintigh and Ammerman 1982) as the basis for activity area, 
feature, or site definitions. Furthermore, characteristics of lower-order records 
could be used in any number of ways to construct variables descriptive of higher- 
order entities. Artifact variables could be transformed, for example, to produce new 
information to describe the features or sites in which they were found. Similarly, 
data from features might be aggregated to characterize sites further, and data from 
sites might produce additional information on the survey unit in which they were 
located. Counts of different types of artifacts recovered from features might be 
transformed to construct a new variable of artifact density for each feature, or 
different types of features found on sites might be tallied to build a new variable to 
characterize sites. It is easy to envision many other kinds of aggregated variables 
that could be developed from lower-level constructs to characterize higher-level 
entities. 
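

The two aggregations just described, artifact density per feature and a tally 
of feature types per site, can be sketched in a few lines. The record layouts, 
identifiers, and feature areas below are invented for illustration; an actual 
system would take them from its own data base. 

```python
from collections import Counter, defaultdict

# Hypothetical artifact records: (artifact_id, artifact_type, feature_id)
artifacts = [
    ("A1", "flake", "F1"),
    ("A2", "flake", "F1"),
    ("A3", "sherd", "F1"),
    ("A4", "flake", "F2"),
]

# Hypothetical feature records: feature_id -> (site_id, feature_type, area_m2)
features = {
    "F1": ("S1", "hearth", 2.0),
    "F2": ("S1", "midden", 4.0),
}


def artifact_counts_by_feature(artifact_rows):
    """Counts of each artifact type within each feature."""
    counts = defaultdict(Counter)
    for _artifact_id, artifact_type, feature_id in artifact_rows:
        counts[feature_id][artifact_type] += 1
    return counts


def artifact_density_by_feature(artifact_rows, feature_rows):
    """New feature-level variable: artifacts per square meter."""
    totals = Counter(feature_id for _id, _type, feature_id in artifact_rows)
    return {fid: totals[fid] / area
            for fid, (_site, _ftype, area) in feature_rows.items()}


def feature_types_by_site(feature_rows):
    """New site-level variable: tally of feature types on each site."""
    tallies = defaultdict(Counter)
    for site_id, feature_type, _area in feature_rows.values():
        tallies[site_id][feature_type] += 1
    return tallies


print(artifact_density_by_feature(artifacts, features))  # {'F1': 1.5, 'F2': 0.25}
```

The aggregation runs in one direction only: the feature-level densities can be 
derived from artifact records, but the artifact records could not be recovered 
from the densities. 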


A second major issue in cultural resource management that directly affects how 
resources are recorded is the question of whether or not artifacts should be collected 
from the surface. Over the past decade a “no collection” policy has become 
standard for more and more federal agencies. The basic reasoning behind this policy 
is that more information is lost by uncontrolled surface collection than is gained by 
having access to cultural materials in the laboratory (S. Plog et al. 1978; Schiffer and 
Gumerman 1977). Certainly many cultural resource inventory surveys are 
conducted without benefit of a research design, and in these cases collecting artifacts, 
especially if they will not be analyzed, serves no useful purpose. But if the 
development of regional predictive models, such as those advocated earlier in this 
chapter, were to become a major objective, then results from all surveys could be 
used in the process of model building. In this case, the no-collection policy would 
have serious ramifications. 


While in theory a no-collection policy should not affect either the quality or 
the extent of artifact analyses, in practice there is little question that it does. In-field 
analysis requires a level of competence for crew members that is generally not met. 
When in fact the requisite expertise is assembled, the costs of in-field analysis rise to 
a level comparable with laboratory analysis. There is no question that, as commonly 
used in cultural resource management, the no-collection policy saves money. The 
question is, at what cost? 


As is the case with so many survey decisions, the impact of the no-collection 
policy is different on different types of sites. For large sites with high 
surface-artifact densities, this policy may not have serious negative effects. Ample numbers 
of temporal diagnostics can usually be found on the surface without much trouble, 
and even without diagnostics these sites are generally classifiable into one of only a 
small number of functional site types. Real problems can arise, however, when 
low-density artifact scatters are encountered. In such cases we usually need all the 
information we can get in order to even hope to define useful analytic units. Often, 
distinguishing criteria, such as the presence or absence of a certain chert type or the 
proportion of flake categories, will not have been devised at the time of fieldwork. 
Thus, even if crews are well trained for in-field analysis, it is simply not possible to 
foresee all the observations that might prove to be informative. Further, detailed, 
technically complex analyses, such as wear pattern or organic residue analysis, may 
be required to address issues of site function. These simply cannot be conducted in 
the field. 


The no-collection policy is in part responsible for our current inability to 
distinguish useful site classes, and it is unlikely that this situation will change until 
the policy is altered. The problem of lost provenience for surface-collected materials 
does not necessarily call for the radical measure of prohibiting collections; 
contracting agencies could simply require that provenience information be recorded. If a 
project is designed to collect data for predictive modeling, what is needed is not that 
we record less information but that we record more information and record it more 
accurately. 


The issue of accuracy is central to any discussion of data recording. Basically, to 
develop a predictive model three categories of data are needed: (a) locational data, 
(b) environmental contextual data, and (c) cultural data. It should be pointed out 
that, if nonsites are to be used in the predictive model, data on the first two 
categories must be recorded for each nonsite location as well. 


The importance of precise locational data may seem obvious for any project 
with the stated goal of developing a predictive model of site location. What is 
perhaps not so obvious is the difficulty of obtaining such data. S. Plog et al. 
(1978:415) cite experiments on Black Mesa in Arizona in which sites were revisited to 
check on locational accuracy. Considerable variation was found, with some sites 
located accurately and others having been plotted more than 200 m from their 
correct location. These problems tend to multiply as more researchers work in an 
area through time. In a Class I overview for the Upper Gila River in Arizona, Phillips 
et al. (1984) found that the same site had been recorded three separate times (twice 
by the same institution) and plotted in three different locations. Portions of another 
site had been recorded as two separate sites by survey teams who were recording 
only the portion of the site that fell within their project area. 


Locational errors such as those cited above indicate the need for some type of 
error-checking program within the survey design. Ideally, such a program would 
include “double-blind” tests in which a second survey crew with no knowledge of 
the first crew's results resurveys the same quadrat. This procedure would be 
especially helpful for federal agencies, such as the BLM, which have placed a high 
priority on maintaining comparable data standards between surveys. Double-blind 
tests allow us to assess locational accuracy, and because two crews record the same 
resources, they also permit us to examine variation in the other aspects of site 
recording discussed below. 
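

One way the results of such a resurvey might be compared is sketched below. 
Several assumptions are made for the sake of the example: locations are given 
as easting and northing in meters, resources recorded by the two crews have 
already been matched under shared labels (R1, R2, and so on), and a 50 m 
tolerance is used. None of these choices comes from the studies cited above. 

```python
import math


def compare_crews(crew_a, crew_b, tolerance_m=50.0):
    """Compare two independent recordings of the same quadrat.

    crew_a, crew_b: dicts mapping a resource label to (easting_m, northing_m).
    """
    shared = sorted(set(crew_a) & set(crew_b))
    offsets = {
        label: math.hypot(crew_a[label][0] - crew_b[label][0],
                          crew_a[label][1] - crew_b[label][1])
        for label in shared
    }
    return {
        "offsets_m": offsets,
        "over_tolerance": [k for k in shared if offsets[k] > tolerance_m],
        "only_crew_a": sorted(set(crew_a) - set(crew_b)),
        "only_crew_b": sorted(set(crew_b) - set(crew_a)),
    }


crew_a = {"R1": (500000.0, 4000000.0),
          "R2": (500300.0, 4000400.0),
          "R3": (501000.0, 4001000.0)}
crew_b = {"R1": (500030.0, 4000040.0),
          "R2": (500300.0, 4000650.0)}

result = compare_crews(crew_a, crew_b)
print(result["over_tolerance"])  # ['R2']
print(result["only_crew_a"])     # ['R3']
```

The same comparison yields two different measures: plotting offsets for 
resources that both crews found, and a list of resources that one crew missed. 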


Another approach to assessing the accuracy of data recording is through 
random spot checks. Such a program would ensure that sites are recorded 
accurately, but it would not assess whether sites were missed. A third approach, often used 
on large surveys, is to use separate survey crews and recording crews. Survey crews 
mark encountered sites on a map (and, if possible, in the field) and the sites are then 
visited by the recording team. This approach has the advantage of providing a 
check on recorded site location and of improving the consistency of data recording. 
The recording crews usually have fewer people than the survey crews, and their 
members have been specifically trained to collect the desired data. 


Collecting environmental data is perhaps the most confusing and difficult area 
of data recording. The reason for this confusion appears to be that archaeologists 
have only poorly developed theoretical notions about the relationships between 
aspects of site location and the environment. The prevailing tactic seems to be to 
use as many environmental variables as possible, in the hope that something useful 
will fall out. The result has been a mushrooming of the quantity of environmental 
data being recorded. Less than a decade ago, one- or two-page survey forms were 
the norm. Today many forms are standardized at the state or institutional level, and 
often they are 10-15 pages long and accompanied by a 20- to 30-page instruction 
manual! 


It is not at all clear that the recent trend toward standardization has either 
improved the accuracy of data recording or provided the desired data. This is 
especially true of features of the environment that are difficult to distinguish 
without quantification, such as plant communities, or those that are known to 
change through time (e.g., vegetative zones). Instead of having archaeologists, who 
may be poorly trained, make many of these observations, a more efficient approach 
would be to determine for a specific project the environmental factors associated 
with site location (either through background research or in a pilot study) and then 
train crews to make only these critical observations. This approach encourages 
flexibility rather than standardization in recording procedures. It may be true that 
at some later date an archaeologist may find that data pertinent to a specific 
problem were not collected. But the same thing can happen even under the 
alternative approach of trying to record everything at once, and worse, the “record 
it all” approach probably increases the chance that whatever data are recorded are 
recorded inaccurately. 


Once we have decided what data to record, we then need to decide how to 
record them. Some types of data can only be observed and recorded in the field 
(e.g., site location, artifact assemblage, etc.), but others, such as vegetation and 
slope, might be recorded equally well either in the field or in the laboratory. There 
is no question that data collected in the laboratory are less expensive to acquire and 
easier for others to replicate than field-recorded data. Before the decision is made to 
collect data in the laboratory, however, the researcher must determine that the 
resulting information will be sufficiently accurate and precise. Specifically, if 
information is going to be taken from 7.5-minute USGS quadrangles, the adequacy of 
these maps for providing the data at the required scale must be tested, rather than 
assumed. Verification of test information taken from maps should be carried out 
before the research design is finalized, and the test data should be selected from a 
variety of environmental settings. 
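

A verification of this kind could be summarized as in the sketch below, which 
compares map-derived values with field measurements at the same test locations 
and reports the mean absolute difference for each environmental setting. The 
setting names and slope values are invented for illustration, and the choice of 
mean absolute difference as the summary measure is an assumption; how large a 
difference is acceptable remains a research-design decision. 

```python
from collections import defaultdict


def mean_abs_diff_by_setting(checks):
    """checks: iterable of (setting, map_value, field_value) for test locations.

    Returns the mean absolute map-versus-field difference per setting.
    """
    sums = defaultdict(lambda: [0.0, 0])  # setting -> [total difference, n]
    for setting, map_value, field_value in checks:
        sums[setting][0] += abs(map_value - field_value)
        sums[setting][1] += 1
    return {setting: total / n for setting, (total, n) in sums.items()}


# Hypothetical slope values (percent) read from a map and measured in the field.
slope_checks = [
    ("floodplain", 2.0, 3.0),
    ("floodplain", 1.0, 1.0),
    ("canyon rim", 10.0, 18.0),
    ("canyon rim", 12.0, 20.0),
]

print(mean_abs_diff_by_setting(slope_checks))
# {'floodplain': 0.5, 'canyon rim': 8.0}
```

Reporting the comparison by setting matters because a map that is adequate on 
level terrain may be inadequate where relief changes over short distances. 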


The decision concerning whether to collect certain data in the field or in the 
laboratory will also be affected by several project-specific considerations. If field 
crew members do not have the training to recognize vegetation patterns or to 
distinguish different artifact types, it may be unrealistic to expect them to record 
such information in the field. On the other hand, there may be instances where 
variables exhibit interaction effects, making it necessary to record in the field 
data for these variables that would otherwise be collected in the laboratory. 
As a case in point, if site size falls below a particular threshold, it might be 
desirable to record some aspects of microtopography in the immediate vicinity of 
the site during the field visit. If laboratory determinations of slope rely on 
calculations using relatively small scale topographic maps, the resultant data may reflect 
only a general average in the neighborhood of the site. 


The third class of information needed for predictive modeling involves record- 
ing of cultural phenomena. In general terms we want to know as much as possible 
about the activities that took place at a locale and about the timing of those 
activities. Data pertinent to these objectives describe the nature of artifacts and 
features present and their spatial distribution. Many of the issues that were 
discussed in the context of areal survey have correlates at the level of site recording. 
For example, just as the spacing between surveyors is the single most important 
factor in determining the number and type of sites found in a survey, the spacing 
between surface collectors is the primary determinant of the number of types of 
artifacts collected (or observed) at a site. Questions of sample size and fraction, unit 
size and shape, and sample design must also be resolved at the site level. 


There are, however, fundamental differences between regional survey and site 
collection strategies. At the regional level we begin with a clearly defined sample 
universe. At the site level, the first issue to be decided is the boundary of the unit. In 
areas of high surface visibility, determining the areal extent of a site may not be 
problematic, in which case defining an appropriate collection or observation 
strategy is relatively straightforward. At sites with minimal or no surface expression, 
much of the time spent recording the site will be devoted to defining the boundary, 
with little or no attempt being made to obtain a representative sample of the 
cultural assemblage. 


A second difference is that at the site level we are sometimes in the position of 
being able to define the entire population of surface artifacts, or at least a very high 
proportion thereof. This is especially true of low-density artifact scatters. Often it is 
less time consuming to flag and map each artifact in the entire site area, and collect 
them if possible, than it is to grid the site and sample it. Further, because one of the 
major problems in predictive modeling is site-class definition, and especially 
functional definition of undiagnostic artifact scatters, complete distributional 
assemblage analysis is often a requirement rather than a luxury. 


In contrast, sites with high artifact densities will probably have to be sampled. 
These sites are not likely to present major definitional problems, however, since 
they will usually yield diagnostic temporal and/or functional data. Any type of 
probabilistic sampling design that ensures that all areas of the site are inspected is 
likely to yield the data necessary for site-class definition. 


Another type of site, the large, low-density artifact scatter, is much more 
troublesome. In many cases the designation of such a phenomenon as a “site” is a 
misnomer, if site is taken to mean anything other than a defined area of cultural 
materials. These sites are usually interpreted as resulting from multiple occupations 
at which similar (or dissimilar) activities may have been conducted. If we are to have 
any hope of disentangling these multiple occupations, precise distributional 
information from large block units must be collected. Thus, the grain size of a grid placed 
over such a site must be at least as large as one cluster of artifacts and features. 


Decisions about appropriate survey unit size and shape should be based on a 
preliminary reconnaissance of the site. Once a grid has been established over the 
site, an appropriate number of survey units can be selected for sampling, with 
artifacts and features in each selected unit mapped and collected or observed and 
recorded. 


In some areas of widespread, low-density artifact scatters it is impossible even 
to distinguish where one site ends and another starts. Van Tries Button (personal 
communication, 1986), faced with such a situation in the San Luis Valley of 
Colorado, developed a survey procedure, termed transect recording, in which the location, 
length, and orientation of sets of 2 m wide transects were specified. Transects were 
spaced every 100 ft and provenienced to a 0.10 mi² unit. Counts on all artifacts and 
on a specified list of environmental attributes found in each transect were made and 
computer coded. In this way an entire 20,000-acre parcel was surveyed. This 
approach was highly successful in this case because the entire area could be 
considered one large, low-density scatter. By not forcing the results into an 
inappropriate concept (i.e., sites), the researchers were able to make useful 
statements about the quantity and nature of cultural resources in a reliable and 
replicable manner. 
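

Counts recorded by transect lend themselves to simple summaries by provenience 
unit. The sketch below is not a description of how the San Luis Valley data 
were actually coded; the record layout, unit labels, and counts are 
hypothetical, and only the 2 m transect width is taken from the account above. 

```python
from collections import defaultdict

TRANSECT_WIDTH_M = 2.0  # transect width reported for the procedure


def density_by_unit(transects):
    """transects: iterable of (unit_id, length_m, artifact_count).

    Returns artifacts per square meter of inspected ground for each unit.
    """
    counts = defaultdict(int)
    area_m2 = defaultdict(float)
    for unit_id, length_m, artifact_count in transects:
        counts[unit_id] += artifact_count
        area_m2[unit_id] += length_m * TRANSECT_WIDTH_M
    return {unit: counts[unit] / area_m2[unit] for unit in counts}


# Hypothetical transect records for two provenience units.
transects = [
    ("U1", 100.0, 4),
    ("U1", 150.0, 1),
    ("U2", 200.0, 0),
]

for unit, density in density_by_unit(transects).items():
    print(unit, density)
```

Units with no artifacts are retained with a density of zero, so that the 
summary describes variation across the whole parcel rather than only the 
places where material was found. 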


DATA PROCESSING 


The collection and processing of new data for predictive modeling, whether in 
the field or in the laboratory, has traditionally been a labor-intensive and largely 
inefficient process. The advent of computers held out the promise that the process 
of getting information from the field into a form that could be analyzed could be 
greatly speeded up and streamlined. During the 1960s and most of the 1970s many 
projects utilized large mainframe computers for this purpose, with varying degrees 
of success. Yet it was not until the rise of relatively inexpensive microcomputers and 
associated hardware and software that the potential of automated data processing 
came within the reach of the vast majority of archaeologists. 


It is not our purpose here to review this rapidly changing field. Instead, we will 
discuss some of the factors that should be considered by those who wish to automate 
data collection and processing. 





Preliminary Considerations 


The process of collecting and recording data for predictive modeling should be 
carefully planned from the beginning of the project. As Sarasan (1981:48) has 
pointed out, once the research design has been selected and most of the data for a 
project have been collected, restructuring of the data system may be extremely 
time consuming, costly, or both, and in certain situations it may indeed be 
impossible. 


In-Field Data Recording Options 


At the present time many different system options designed to convert raw 
data into machine-readable form exist. Not all of those available are suitable for use 
in field-recording situations, however, and some that are adaptable to field use are 
more practical in certain settings than others. Factors other than intended location 
of use also influence the choice of an optimal data-recording system. One of the 
most important considerations is to minimize the number of steps between data 
observation and machine-readable record, since this reduces the opportunities for 
transcription errors (Gaines and Gaines 1980; Nagle and Wilcox 1982). 
Considerations affecting decisions about data recording will differ between field and 
laboratory settings of a single project, and data recording will probably be subject to 
different constraints during each new investigation. 

The most commonly used recording format is the familiar, handwritten data 
code sheet, which is used in various permutations for coding site survey or artifact 
data. Data code sheets have been in use for a long time and are not likely to be easily 
supplanted as the primary archaeological tool for field data entry. Handwritten 
forms are highly portable, survive all but the most adverse field conditions, and 
provide a readily accessible hard copy of the information they contain. “When all 
else fails, one can always go back to the field notes” is perhaps the most commonly 
held (if not always the most accurate) archaeological perception of data recording. 
On the other hand, most code sheets filled in by hand are not machine-readable and 
must go through a secondary transcription to attain this state, a step that has the 
potential for introducing errors into the data. 


Nevertheless, variations of the handwritten data code sheet will continue to be 
used in gathering data, as they should for small to moderately sized projects. 
Because site survey and/or artifact forms have to be transcribed, they should be 
designed to follow as closely as possible the intended flow of later machine entry. 
Chenhall (1975) lists many “do's” and “don't's” for those who anticipate 
developing and using hand-completed forms as the first stage in data entry. 

Another well-known paper format, the optical mark, OMR, or OPSCAN form, 
possesses many of the advantages of the handwritten data code sheet but is directly 
machine readable as well (Nagle and Wilcox 1982). Customized forms have been 
employed to create artifact records in the field (Nagle and Wilcox 1982), to code 
faunal data (Bonnichsen and Sanger 1977), and to capture site data on several 
archaeological surveys (Klinger 1977, cited in Schiffer et al. 1978:14; Scholtz and 
Million 1981:18). If creatively designed, customized OPSCAN forms represent a 
viable alternative to the use of handwritten code sheets for field data entry since 
they are well suited to handling interval-scale data as well as other numerically 
codable, ordinal- and nominal-scale variables in common use in predictive 
modeling. OPSCAN forms might also be chosen as a means of data entry when poor field 
environmental conditions eliminate or restrict the use of other automated 
possibilities. 


One of the most promising areas of automation in field data collection is the 
continuing development of portable data collectors. These machines, often no 
larger than a standard calculator, record and store data in a machine-readable 
format that can subsequently be transferred to a more powerful and less portable 
machine. Portable data collectors, or PDCs, have been used since the early 1970s in 
the fields of forestry and mining (Cooney 1985). Many of the early PDCs had 
dedicated functions, such as determining tree height or board feet, which restricted 
their use to one discipline. 


In the late 1970s a number of archaeologists began experimenting with the use 
of PDCs in field situations (Altschul and Sanders 1984; Stephen and Craig 1984). 
While the technique was promising, these researchers ran into a number of common 
obstacles: most notably, excessive power demands, programming difficulties 
(many of the early machines, such as the Hewlett-Packard 41 series, could only be 
programmed in a language specific to that machine), storage limitations, 
communication problems, and the inability to produce paper copy in the field. With the 
advent of lap or notebook computers, virtually all of these problems have been 
solved. Computers are now readily available that can easily be carried into the field, 
are battery powered, have built-in communication capabilities, and can utilize one 
or more high-level programming languages. Further, the development of 
battery-powered peripherals, such as microcassette drives and printers, provides the 
necessary storage demanded by archaeological field situations as well as the 
capability to produce on-site hard copy of field forms and bit-mapped drawings. 


Laboratory Data Recording 


Although much has been made of the potential use of microcomputers on-site, 
this is rarely feasible. By nature, surveys are mobile, and microcomputers (even the 
so-called portables) are ill-designed for this purpose. Microcomputers may be more 
useful on-site during excavations, but by and large the primary purpose of having a 
machine in the field is to record and store data, and in this role microcomputers 
(even with all their power and capabilities) are simply no match for the lighter, less 
expensive, and more maneuverable PDCs. 


Where microcomputers can be used effectively is in the laboratory. Here the 
computer can provide data entry, data storage and management, text editing, and 
statistical manipulations and can serve as a mechanism to communicate with and 
transport data to and from other micro- and mainframe computers. For survey 
projects of even moderate size, data management with a data base management 
system is probably a cost-effective strategy. As most archaeologists are familiar with 
these computer capabilities, they will not be discussed further. 


We would like to note, however, that all commercial statistical software 
packages (whether for a micro- or mainframe computer) with which we are familiar 
require input data to be in the form of a sequential (and generally ASCII format) file. 
Records in such a file (so-called flat files) usually correspond to a survey unit, a site, 
or an artifact, although these record types would not be interspersed in a single file. 
Since this will probably be the file format in which the vast majority of predictive 
modeling analyses are conducted, anyone contemplating using a generalized data 
base management program to store and manipulate his or her data should be 
cognizant of the fact that it will be necessary to convert the records, or a subset of 
the records, to a flat file format prior to conducting statistical analyses. Fortunately, 
most software packages incorporate utility programs to accomplish this step easily, 
but the capability of creating a sequential output file structure should still be 
ascertained in advance of selecting any particular software for data base management. 
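
The conversion to a flat file can be illustrated with a minimal Python sketch. It 
assumes a hypothetical set of nested site records (each site carrying its own list of 
artifact records) and writes one sequential ASCII record per site, so that record 
types are not interspersed. The field names, the sample values, and the helper 
functions are all assumptions made for illustration; they do not come from any 
particular data base management package. 

```python
import csv
import io

# Hypothetical nested records, as a data base management program might hold
# them: each site has its own attributes plus a list of artifact records.
sites = [
    {"site_id": "S001", "elevation_m": 1520, "landform": "terrace",
     "artifacts": [{"type": "flake", "count": 12}, {"type": "sherd", "count": 3}]},
    {"site_id": "S002", "elevation_m": 1610, "landform": "ridge",
     "artifacts": [{"type": "flake", "count": 5}]},
]


def sites_to_flat_rows(site_records):
    """Return one flat row per site; artifact records are summarized as a
    total count rather than interspersed with the site records."""
    rows = []
    for site in site_records:
        rows.append({
            "site_id": site["site_id"],
            "elevation_m": site["elevation_m"],
            "landform": site["landform"],
            "n_artifacts": sum(a["count"] for a in site["artifacts"]),
        })
    return rows


def write_flat_file(rows, handle):
    """Write the rows as a sequential, comma-delimited text file."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer here; a real project would open a file instead.
buffer = io.StringIO()
write_flat_file(sites_to_flat_rows(sites), buffer)
flat_text = buffer.getvalue()
print(flat_text)
```

A separate flat file with one record per artifact could be produced in the same way 
if the analysis is to be conducted at the artifact level. 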


CONCLUSIONS 


This chapter has presented an outline of the data needed to create a predictive 
model, some of the factors that should guide the development of a survey strategy 
to obtain those data, and the constraints of data collection in a cultural resource 
management context. By virtue of the fact that different types of data are needed at 
different times to build a predictive model, the process lends itself very well to 
multistep survey designs. While each situation will call for a distinct strategy, some 
general guidelines can be suggested. 

The first step of fieldwork should concentrate on three topics: (a) magnet 
sites, (b) depositional and postdepositional processes, and (c) estimates of site 
density and of the range of site types. Some sort of informed probing of specific 
locations (i.e., using information from local informants or regional knowledge) 
combined with extensive areal coverage (either through imagery or actual flyover) 
should detect a large proportion of the magnet sites. A detailed geoarchaeological 
analysis should provide the necessary information on paleo land surfaces as well as 
indicate past trends in environmental conditions. Finally, some type of small-scale 
probability sample survey can be used to calculate working density estimates and to 
obtain some notion of the range of variability in site types. Sample universes should 
conform to natural units, and the area to be surveyed should be stratified if previous 
information can lead to the definition of justifiable strata; otherwise, a simple 
random sampling approach is advisable. The level of survey intensity for this first 
stage should probably be high. 
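
As a rough illustration of how working density estimates might be obtained from a 
simple random sample, the following Python sketch selects survey units at random 
and computes mean site density with a standard error that includes the finite 
population correction. The function names, the number and size of the survey units, 
and the site counts are hypothetical and are offered only as an example. 

```python
import math
import random


def draw_simple_random_sample(n_units, n_sample, seed=0):
    """Select n_sample survey units, without replacement, from units
    numbered 0 .. n_units - 1."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_units), n_sample))


def density_estimate(site_counts, n_units, unit_area_km2):
    """Estimate sites per square kilometer and its standard error from the
    site counts observed in the sampled units."""
    n = len(site_counts)
    mean = sum(site_counts) / n
    variance = sum((c - mean) ** 2 for c in site_counts) / (n - 1)
    fpc = 1 - n / n_units  # finite population correction
    std_error = math.sqrt(fpc * variance / n)
    return mean / unit_area_km2, std_error / unit_area_km2


# Hypothetical universe of 100 quadrats, each 0.25 square kilometers.
sampled_units = draw_simple_random_sample(n_units=100, n_sample=4, seed=1)
# Hypothetical site counts recorded in the four surveyed quadrats.
density, std_error = density_estimate([0, 1, 2, 1], n_units=100,
                                      unit_area_km2=0.25)
print(sampled_units, density, std_error)
```

With so few units the standard error is large relative to the estimate, which is one 
reason the estimates obtained at this stage are best treated as working figures. 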

The second step of fieldwork should be devoted to obtaining the specific 
information needed to develop the predictive model. Data must be gathered on the 
relationship between site locations and environmental features and between sites 
and other sites. Based on the preliminary density estimates and the location of 
magnet sites, the sample universe(s) should be stratified if at all possible. For 
example, catchment zones can be defined around each intrinsically important site 
and treated as separate strata, as can environmental zones that show wide ranges in 
site density. Optimal allocation formulas can be used to maximize survey resources. 
It may be necessary to increase the grain size of the grid during this stage of the 
survey. This will especially be true if interest focuses on intersite relationships, so 
that a high proportion of survey units must contain two or more sites. 
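
The text does not specify which optimal allocation formula is intended. One widely 
used rule, often called Neyman allocation, assigns sample units to each stratum in 
proportion to the product of the stratum size and the standard deviation of site 
counts within it. The Python sketch below illustrates that rule under stated 
assumptions: the stratum sizes and standard deviations are invented, the function 
name is hypothetical, and survey costs are assumed to be equal across strata. 

```python
def neyman_allocation(total_units, stratum_sizes, stratum_sds):
    """Allocate total_units sample units among strata in proportion to
    N_h * S_h (stratum size times within-stratum standard deviation).

    Fractional allocations are rounded by the largest-remainder method so
    that the allocations sum exactly to total_units. The sketch does not cap
    an allocation at the number of units available in a stratum."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one stratum must have nonzero variability")
    raw = [total_units * w / total_weight for w in weights]
    allocation = [int(x) for x in raw]  # round down first
    leftover = total_units - sum(allocation)
    # Hand the remaining units to the strata with the largest fractions.
    by_fraction = sorted(range(len(raw)),
                         key=lambda i: raw[i] - allocation[i], reverse=True)
    for i in by_fraction[:leftover]:
        allocation[i] += 1
    return allocation


# Hypothetical strata: 400, 300, and 300 survey units, with preliminary
# standard deviations of sites per unit of 0.5, 1.0, and 2.0.
print(neyman_allocation(100, [400, 300, 300], [0.5, 1.0, 2.0]))
```

Under these invented figures the most variable stratum receives more than half of 
the sample even though it is not the largest, which is the intended effect of the rule. 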


A third step of fieldwork is necessary to test the model. At this stage some form 
of purposive selection may be used to designate for survey areas predicted to 
contain sites and areas predicted not to contain sites. Alternatively, the region 
could be stratified on the basis of high, medium, and low probability of site location, 
and each resulting stratum could be sampled according to some probabilistic 
design. Also at this time the geomorphic map of the survey area should be tested. 
One approach would be to place subsurface tests, such as deep cores, test pits, or 
shovel probes, according to some multistep sampling design. A second possibility 
would be to use some type of subsurface test, such as backhoe trenches, at specific 
locales along an alluvial terrace. 
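
The stratified alternative for testing the model can be sketched in Python as 
follows. The sketch assumes that a model has already assigned each survey unit a 
predicted probability of containing a site; the cutoff values separating the low, 
medium, and high strata, the unit labels, and the function names are hypothetical 
choices made only for illustration. 

```python
import random


def classify(probability, low_cut=0.33, high_cut=0.66):
    """Assign a predicted probability to a low, medium, or high stratum.
    The cutoffs are arbitrary illustrative values."""
    if probability < low_cut:
        return "low"
    if probability < high_cut:
        return "medium"
    return "high"


def stratified_test_sample(predicted, n_per_stratum, seed=0):
    """Draw a simple random sample of survey units from each probability
    stratum. `predicted` maps a unit label to its predicted probability."""
    strata = {"low": [], "medium": [], "high": []}
    for unit, probability in sorted(predicted.items()):
        strata[classify(probability)].append(unit)
    rng = random.Random(seed)
    return {
        name: sorted(rng.sample(units, min(n_per_stratum, len(units))))
        for name, units in strata.items()
    }


# Hypothetical predictions for 20 survey units.
predicted = {f"U{i:02d}": i / 20 for i in range(20)}
print(stratified_test_sample(predicted, n_per_stratum=3, seed=1))
```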


Discussions of multistep survey designs are not new in archaeology (e.g., 
Binford 1964; Judge et al. 1975; S. Plog et al. 1978; Schiffer and Wells 1982; Schiffer et 
al. 1978). Implementation of such designs, however, is less common, and multistep 
surveys are almost nonexistent within cultural resource management contexts. By 
its very nature predictive modeling is a multifaceted process; it is important, 
therefore, that surveys designed to collect data for predictive modeling projects be 
multistep as well. 


REFERENCES CITED 


Altschul, Jeffrey H. 
1986 Bug Hill: Excavation of a Multicomponent Midden Mound in the Jackfork Valley, Southeast Oklahoma. 
Report of Investigation No. 81-1. New World Research, Pollock, Louisiana. 


Altschul, Jeffrey H., and John C. Sanders 
1984 An Automated Approach to Archaeological Site Mapping and Recording. Ms. on file, 
Statistical Research, Tucson. 


Bell, Colin, and Howard Newby 
1976 Community Studies: An Introduction to the Sociology of the Local Community. Praeger, New York. 


Binford, Lewis R. 
1964 A Consideration of Archaeological Research Design. American Antiquity 29:425-441. 


1968 Review of Rethinking Archaeology, by K. C. Chang. Ethnohistory 15:422-426. 


Bliss, C. I. 
1953 Fitting the Negative Binomial Distribution to Biological Data. Biometrics 9:176-200. 


Bonnichsen, Robson, and David Sanger 
1977 Integrating Faunal Analysis. Canadian Journal of Archaeology 1:108-154. 


Butzer, Karl W. 
1971 Environment and Archaeology: An Introduction to Pleistocene Geography. Aldine, Chicago. 


1982 Archaeology as Human Ecology: Method and Theory for a Contextual Approach. Cambridge 
University Press, Cambridge, England. 




Chang, K. C. 
1967 Rethinking Archaeology. Random House, New York. 


1968 Settlement Archaeology. National Press, Palo Alto. 


Chenhall, Robert G. 
1975 Museum Cataloging in the Computer Age. American Association for State and Local History, 
Nashville. 


Clarke, David L. 
1968 Analytical Archaeology. Methuen, London. 


Clarke, David L., editor 
1977 Spatial Archaeology. Academic Press, New York. 


Cliff, A. D., and J. K. Ord 
1973 Spatial Autocorrelation. London. 


Cochran, William G. 
1977 Sampling Techniques, 3rd ed. Wiley and Sons, New York. 


Cooney, Timothy M. 
1985 Portable Data Collectors, and How They're Becoming Useful. Journal of Forestry 83:18-23. 


Cowgill, George L. 
1975 A Selection of Samplers: Comments on Archaeo-Statistics. In Sampling in Archaeology, edited 
by James W. Mueller, pp. 258-274. University of Arizona Press, Tucson. 


Dacey, Michael F. 
1964 Modified Poisson Probability Law for Point Patterns More Regular than Random. Annals of the 
Association of American Geographers 54:559-565. 


Davidson, D. A., and M. L. Shackley, editors 
1976 Geo-Archaeology: Earth Science Past and Present. Duckworth, London. 


Dixon, C., and B. Leach 
1978 Sampling Methods for Geographical Research. Concepts and Techniques in Modern Geography 
No. 17. Geo Abstracts, Norwich, England. 


Doelle, William H. 
1975 Desert Resources and Hohokam Subsistence: The Conoco Florence Project. Archaeological Series No. 103. 
Arizona State Museum, Tucson. 


1977 A Multiple Survey Strategy for Cultural Resource Management Studies. In Conservation 
Archaeology: A Guide for Cultural Resource Management Studies, edited by Michael B. Schiffer and 
George J. Gumerman, pp. 201-209. Academic Press, New York. 


Dunnell, Robert C., and William S. Dancey 
1983 The Siteless Survey: A Regional Scale Data Collection Strategy. In Advances in Archaeological 
Method and Theory, vol. 6, edited by Michael B. Schiffer, pp. 267-287. Academic Press, New York. 


Durand, Stephan R., and Jonathan O. Davis 
1985 Archaeological Computer Application with an IBM-PC. Advances in Computer Archaeology 
2:21-38. 


Ebert, James I., LuAnn Wandsnider, and Signa Larralde 
1984 Theoretical, Methodological and Economic Aspects of Nonsite Surface Survey, Nonsite 
Sampling and Predictive Modeling. Paper presented at the 49th Annual Meeting of the Society 
for American Archaeology, Portland. 


Flannery, Kent V., editor 
1976 The Early Mesoamerican Village. Academic Press, New York. 


Gaines, Sylvia W., and Warren M. Gaines 
1980 Future Trends in Computer Applications. American Antiquity 45:462-471. 


Galeski, Boguslaw 
1972 Basic Concepts of Rural Sociology, translated by H. C. Stevens. Manchester University Press, 
Manchester, England. 


Goodenough, Ward H. 
1966 Property, Kin, and Community on Truk. Archon Books, Hamden, Connecticut. 


Goodyear, Albert C. 
1975 Hecla II and III: An Interpretive Study of Archaeological Remains from the Lakeshore Project, Papago 
Reservation, South Central Arizona. Anthropological Research Papers No. 9. Arizona State 
University, Tempe. 


Harvey, David 
1967 Models in the Evolution of Spatial Patterns in Human Geography. In Models in Geography, 
edited by Richard J. Chorley and Peter Haggett, pp. 549-608. Methuen, London. 


Hassan, F. A. 
1979 Geoarchaeology: The Geologist and Archaeology. American Antiquity 44:267-270. 


Hauck, F. R. 
1979a Cultural Resource Evaluation in Central Utah, 1977. Cultural Resource Series No. 3. Bureau of 
Land Management, Salt Lake City. 


1979b Cultural Resource Evaluation in South Central Utah, 1977-1978. Cultural Resource Series No. 4. 
Bureau of Land Management, Salt Lake City. 


Hayes, William L., and Robert L. Winkler 
1971 Statistics: Probability, Inference, and Decision. Holt, Rinehart and Winston, New York. 


Haynes, C. Vance, Jr. 
1968 Geochronology of Late-Quaternary Alluvium. In Means of Correlation of Quaternary Successions, 
edited by R. B. Morrison and H. E. Wright, Jr., pp. 591-631. Proceedings of the VII INQUA 
Congress, vol. 8. University of Utah Press, Salt Lake City. 


Haynes, C. Vance, Jr., and G. A. Agogino 
1966 Prehistoric Springs and Geochronology of the Clovis Site, New Mexico. American Antiquity 
31:812-821. 


Hodder, Ian 
1977 Some New Directions in the Spatial Analysis of Archaeological Data at the Regional Scale. In 
Spatial Archaeology, edited by David L. Clarke, pp. 223-352. Academic Press, New York. 


Hodder, Ian, and Clive Orton 
1976 Spatial Analysis in Archaeology. Cambridge University Press, London. 


Hudson, John C. 
1969 A Location Theory for Rural Settlement. Annals of the Association of American Geographers 
59:365-381. 


Jacobsen, T., and R. M. Adams 
1958 Salt and Silt in Ancient Mesopotamian Agriculture. Science 128:1251-1258. 


Judge, W. James 
1981 Transect Sampling in Chaco Canyon: Evaluation of a Survey Technique. In Archaeological 
Surveys of Chaco Canyon, New Mexico, by Alden C. Hayes, David M. Brugge, and W. James Judge, pp. 
107-137. Publications in Archaeology 18A. National Park Service, Washington, D.C. 


Judge, W. James, James I. Ebert, and Robert K. Hitchcock 
1975 Sampling in Regional Archaeological Survey. In Sampling in Archaeology, edited by James W. 
Mueller, pp. 82-123. University of Arizona Press, Tucson. 




King, Leslie J. 
1969 Statistical Analysis in Geography. Prentice-Hall, Englewood Cliffs. 


Kintigh, Keith W., and Albert J. Ammerman 
1982 Heuristic Approaches to Spatial Analysis in Archaeology. American Antiquity 47:31-63. 


Klinger, Timothy C., assembler 
1977 New Hope: An Archaeological Assessment of a Proposed Strip Mine Tract in the Gulf Coastal Plain of 
Southwest Arkansas. Arkansas Archaeological Survey, Fayetteville. 


Krakker, J., M. Shott, and P. Welch 
1983 Design and Evaluation of Shovel-Test Sampling in Regional Archaeological Survey. Journal of 
Field Archaeology 10:469-480. 


Kvamme, Kenneth L. 
1983 A Manual for Predictive Site Location Models: Examples for the Grand Junction District, 
Colorado. Draft submitted to the Bureau of Land Management, Grand Junction District, 
Colorado. 


Larralde, Signa, and Susan M. Chandler 
1981 Archaeological Inventory in the Seep Ridge Cultural Study Tract, Uintah County, Utah, with a Regional 
Predictive Model for Site Location. Utah Cultural Resource Series No. 5. Bureau of Land 
Management, Salt Lake City. 


Leach, Edmund R. 
1961 Pul Eliya: A Village in Ceylon. Cambridge University Press, London. 


Lightfoot, Kent G. 
1986 Regional Surveys in the Eastern United States: The Strengths and Weaknesses of 
Implementing Subsurface Testing Programs. American Antiquity 51:484-504. 


Martin, Paul S., and Richard G. Klein, editors 
1984 Quaternary Extinctions: A Prehistoric Revolution. University of Arizona Press, Tucson. 


Matson, Richard G., and William D. Lipe 
1975 Regional Sampling: A Case Study of Cedar Mesa. In Sampling in Archaeology, edited by James 
W. Mueller, pp. 124-143. University of Arizona Press, Tucson. 


Mayer-Oakes, W. J., and R. J. Nash 
1964 Archeological Research Design: A Critique. Paper presented at the 63rd Annual Meeting of 
the American Anthropological Association. 


McManamon, F. P. 
1984 Discovering Sites Unseen. In Advances in Archaeological Method and Theory, vol. 7, edited by 
Michael B. Schiffer, pp. 223-292. Academic Press, New York. 


Millon, Rene 
1972 The Teotihuacan Map, Part One: Text. Urbanization at Teotihuacan, Mexico, vol. 1. University 
of Texas Press, Austin. 


Mueller, James W. 
1974 The Use of Sampling in Archaeological Survey. Memoirs of the Society for American Archaeology 
No. 28. 


1975 Archaeological Research as Cluster Sampling. In Sampling in Archaeology, edited by James W. 
Mueller, pp. 33-41. University of Arizona Press, Tucson. 


Mueller, James W., editor 
1975 Sampling in Archaeology. University of Arizona Press, Tucson. 


Nagle, Christopher L., and U. V. Wilcox 
1982 Optical Mark Recognition Forms in Data Entry: Some Applications. Journal of Field 
Archaeology 9:538-547. 




Nance, Jack D. 
1983 Regional Sampling in Archaeological Survey: The Statistical Perspective. In Advances in 
Archaeological Method and Theory, vol. 6, edited by Michael B. Schiffer, pp. 289-356. Academic Press, 
New York. 


Nance, Jack D., and Bruce F. Ball 
1986 No Surprises? The Reliability and Validity of Test Pit Sampling. American Antiquity 
51:457-483. 


Neyman, J. 
1934 On the Two Different Aspects of the Representative Method: The Method of Stratified 
Sampling and the Method of Purposive Selection. Journal of the Royal Statistical Society 97:558-606. 


Orton, Clive 
1980 Mathematics in Archaeology. Cambridge University Press, London. 


Phillips, David A., Jr., Linda L. Swann, and Jeffrey H. Altschul 
1984 Prehistory and History of the Upper Gila River, Arizona and New Mexico: An Archaeological Overview. 
Western Division Report of Investigation No. 2. New World Research, Tucson. 


Plog, Fred T. 
1981 Managing Archaeology: A Background Document for Cultural Resource Management on the 
Apache-Sitgreaves National Forests, Arizona. Report No. 1. Forest Service, Southwestern Region, 
Albuquerque. 


Plog, Stephen 
1976 Relative Efficiencies of Sampling Techniques for Archaeological Surveys. In The Early 
Mesoamerican Village, edited by Kent V. Flannery, pp. 136-158. Academic Press, New York. 


1978 Sampling in Archaeological Surveys: A Critique. American Antiquity 43:280-285. 


Plog, Stephen, Fred Plog, and Walter Wait 
1978 Decision Making in Modern Surveys. In Advances in Archaeological Method and Theory, vol. 1, 
edited by Michael B. Schiffer, pp. 383-421. Academic Press, New York. 


Read, Dwight W. 
1975 Regional Sampling. In Sampling in Archaeology, edited by James W. Mueller, pp. 45-60. 
University of Arizona Press, Tucson. 


Redman, Charles L. 
1974 Archaeological Sampling Strategies. Modules in Anthropology No. 55. Addison-Wesley, New 
York. 


Reed, Alan D., and Susan M. Chandler 
1984 A Sample-Oriented Cultural Resource Inventory in Carbon, Emery, and Sanpete Counties, 
Utah (draft). Nickens and Associates. Submitted to Bureau of Land Management, Contract No. 
YA-553-CT2-1080. Copies available from Bureau of Land Management, Moab District Office, 
Moab, Utah. 


Rogge, A. E., and T. R. Lincoln 
1984 Predicting the Distribution of Archaeological Sites: A Case Study from the Central Arizona 
Project. Paper presented at the 49th Annual Meeting of the Society for American Archaeology, 
Portland. 


Sanders, William T. 
1965 The Cultural Ecology of the Teotihuacan Valley. Department of Sociology and Anthropology, 
Pennsylvania State University, University Park. 


Sanders, William T., Jeffrey R. Parsons, and Robert S. Santley 
1979 The Basin of Mexico: Ecological Processes in the Evolution of a Civilization. Academic Press, New 
York. 




Sarasan, Lenore 
1981 Why Museum Computer Projects Fail. Museum News January/February:40-49. 


Saucier, Roger T. 
1974 Quaternary Geology of the Lower Mississippi Valley. Research Series No. 6. Arkansas Archaeological 
Survey, Fayetteville. 


Schiffer, Michael B., and George J. Gumerman, editors 
1977 Conservation Archaeology: A Guide for Cultural Resource Management Studies. Academic Press, New 
York. 


Schiffer, Michael B., and S. Wells 
1982 Archaeological Surveys: Past and Future. In Hohokam and Patayan: Prehistory of Southwestern 
Arizona, edited by R. H. McGuire and M. B. Schiffer, pp. 345-383. Academic Press, New York. 


Schiffer, Michael B., Alan P. Sullivan, and Timothy C. Klinger 
1978 The Design of Archaeological Surveys. World Archaeology 10(1):1-28. 


Scholtz, Sandra C., and Michael G. Million 
1981 A Management Information System for Archaeological Resources. In Data Bank Applications in 
Archaeology, edited by Sylvia W. Gaines, pp. 15-26. University of Arizona Press, Tucson. 


Stephen, David V. M., and Douglas B. Craig 
1984 Recovering the Past Bit by Bit with Microcomputers. Archaeology 37(4):20-26. 


Teague, Lynn S., and Patricia L. Crown, editors 
1983 Specialized Activity Sites. Hohokam Archaeology along the Salt-Gila Aqueduct, Central 
Arizona Project, vol. 3. Archaeological Series No. 150. University of Arizona, Tucson. 


Thomas, David Hurst 
1975 Nonsite Sampling in Archaeology: Up the Creek without a Site? In Sampling in Archaeology, 
edited by James W. Mueller, pp. 61-81. University of Arizona Press, Tucson. 


Thomas, Prentice M., Jr., Carol S. Weed, L. Janice Campbell, and Jeffrey H. Altschul 
1981 The Central Coal II Project: A Class I Inventory of Selected Portions of Carbon, Emery, and Sevier 
Counties, Utah. Report of Investigation No. 25. New World Research, Pollock, Louisiana. 


Tipps, Betsy L. 
1984 The Tar Sands Project: Cultural Resource Inventory and Predictive Modeling in Central and Southern 
Utah. P-III Associates, Cultural Resources Report 405-1-8401. Submitted to Bureau of Land 
Management, Contract No. YA551-CT3-340038. Copies available from Bureau of Land 
Management, Richfield District Office, Richfield, Utah. 


Wobst, H. M. 
1983 We Can't See the Forest for the Trees: Sampling and the Shapes of Archaeological 
Distributions. In Archaeological Hammers and Theories, edited by J. A. Moore and A. S. Keene, pp. 37-85. 
Academic Press, New York. 


Wood, John J. 
1971 Fitting Discrete Probability Distributions to Prehistoric Settlement Patterns. In The 
Distribution of Prehistoric Population Aggregates, edited by G. J. Gumerman, pp. 63-82. Anthropological 
Reports No. 7. Prescott College Press, Prescott, Arizona. 














Chapter 7 


USING EXISTING ARCHAEOLOGICAL SURVEY 
DATA FOR MODEL BUILDING 


Kenneth L. Kvamme 


This chapter examines the use of existing archaeological survey data for the 
development of archaeological locational models. Observe that if an a priori 
deductive modeling strategy is being pursued, then there is no need for site survey data of 
any kind for model development (since presumably the “rules” of prehistoric site 
placement will be derived through theoretical or other means). Hence, this chapter 
necessarily is oriented toward quantitative model development based on patterns 
exhibited by empirical data, in this case existing site survey data. 


A fundamental assumption made throughout this chapter, unless otherwise 
stated, is that the archaeological site is the basic unit of analysis. For some strategies, a 
grid cell of small size (e.g., 50 by 50 m) that contains a site or a significant amount of 
prehistoric cultural evidence is the unit of analysis, but this grid cell type of unit can 
be assumed to be included in discussions using the site concept. 


Our primary concerns when using existing site survey data are with locational 
and site content information because these two types of data are impossible to 
obtain without additional survey. We are interested in the locations of known sites 
because most empirical modeling strategies are based on patterns identified in 
various characteristics of site locations. We are interested in site content information 
for clues that might suggest site function or type, cultural affiliation, or period of 
occupation. These data are important because we want ideally to develop models 
for specific types or period groupings of sites. As noted in Chapter 8, however, 
trustworthy inferences about site function often are difficult to make based on site 
survey information, and for many sites all that can be said is that a prehistoric site is 
present at some location. 


A third type of information that usually is available in existing site survey 
reports includes various environmental descriptions pertaining to a site’s situation 
(e.g., vegetation, soils, landform). Although environmental data usually are the 
very information that is needed for many modeling strategies, the kinds of 
environmental data commonly included with most site reports often will not coincide with 
the data requirements of a locational analysis and modeling strategy, and in any 
case, the environmental observations usually are inconsistently recorded from site 
to site. Fortunately, the environmental information reported in existing site survey 
data is not critical to locational modeling because such data can be observed and 
measured (and consistently and reliably measured) in virtually any manner on 
various kinds of maps, aerial photographs, or even through remote-sensing or 
computer-based geographic information systems techniques (see Chapters 9 and 10). 


Collectively, existing site survey data form a large and underutilized body of 
information that is available in almost any region of study. This body of data 
represents the cumulative effort of, perhaps, decades of archaeological work 
performed at considerable cost. Although archaeologists might argue that random 
samples of site survey data (collected on the basis of regional probabilistic sampling 
designs) are necessary to make valid regionwide generalizations, new surveys are 
expensive. Moreover, such an argument neglects an important source of potentially 
abundant and useful information in the form of existing site survey data. It could be 
that existing data are well distributed throughout a region of study and are 
“approximately representative” of a region’s archaeology. Alternatively, using 
procedures discussed in this chapter, it might be possible to make existing data 
better represent the archaeology of a region through removal or reduction of 
apparent biases. If existing site survey data could be used in locational studies in 
place of new survey data, considerable savings in time and cost could be realized. 


Of course, the quality of existing site survey data might be questionable and 
biases might exist in those data. A major focus of this chapter is on ways of 
removing, or at least reducing, apparent biases from existing data bases in order to 
obtain better-quality analysis data sets for use in model development or testing. 
There is no procedure that can correct all biases, of course, and it certainly is not 
possible to make good data out of bad, but a number of procedures are available that 
can be used in an effort to reduce certain biases. In most cases, existing 
archaeological data bases do not constitute a representative sample of the archaeological 
remains in a region of interest; even in cases where some type of random sample 
survey results are available, the procedures discussed in this chapter will be useful 
for preparing other available data for use as, among other things, a test sample with 
which to assess the performance of site-location models independently (see 
“Assessing Model Performance” and “Independent Tests,” Chapter 8). Problems in the 
use of existing data are myriad, and only a few can be discussed in detail here. The 
following pages consider the implications of these problems for model building with 
existing data. (The statistical and mathematical details of model development are 
discussed in Chapters 5 and 6; the application of these methods in model 
development and testing is illustrated in Chapter 8.) 


USE OF EXISTING DATA FOR SITE-LOCATION MODELS 


A few years ago I conducted a large survey designed to yield a random sample 
of prehistoric sites that was to be used for developing archaeological models of site 
location for the region studied. After the survey was completed I had the 
opportunity to meet with a statistical consulting group in a university mathematics 
department. I presented maps illustrating our random sampling design and the 
locations of the sites that we had discovered. The same maps also happened to show 
the locations of a few hundred sites known to exist prior to the survey. Although we 
discussed several interesting topics, the one that struck me most forcibly was that 
the statisticians were amazed that I had conducted such a large and expensive 
survey when several hundred site locations were already known for that region. 


This reasoning went against all my archaeological training and against what I 
perceived as an accepted notion in settlement archaeology: that in order to make 
valid regional inferences about archaeological site location (or any other) patterning 
one needed representative samples chosen on the basis of probabilistic sampling 
theory. This position has been stated by Binford: 


Probability sampling is . . . a major methodological improvement which, if executed on all 
levels of data collection in full recognition of the inherent differences in the nature of 
observational populations which archaeologists investigate, can result in the production 
of adequate and representative data useful in the study of cultural process [1964:439]. 


The statisticians did admit that my sample seemed very nice, but they pointed 
out that sampling is a pragmatic effort conducted for the purpose of reducing costs. 
That I had sampled in the first place indicated a concern for cost, yet I conducted an 
expensive survey even though the previously known sites existed in large numbers 
and appeared to be well distributed in the region. When I asked them about the role 
of statistical theory in model development, they suggested that I worry less about 
theory and more about how well the model works in practice. 


In Chapter 8 it is emphasized that from a statistical standpoint any procedure, 
ranging from statistical techniques to simple mathematical rules or even armchair 
theory, might appropriately be used as a basis for site-location model 
development. What matters is how well a model works in application, how accurately it 
performs on future cases. Given this perspective, it is appropriate to use any type of 
procedure as well as any source of data (such as existing site-file information) in 
model development. In order to determine how well a model will perform in 
practice (and here I refer to any type of model, including those formulated 
deductively), independent testing procedures are required, and in this case methods of 
statistical inference must be applied. Independent testing means that a model is applied 
to data independent of the data set used to build the model (note that deductively 
derived models are not built with data, and therefore any data set is independent of 
these models), which provides a test of model performance. Statistical theory can 
then be applied to the test results (if the test data constitute a representative 
sample) in order to assess the significance of the resulting model performance and 
construct confidence limits around model accuracy rates. (The reader is referred to 
the section on assessing model performance in Chapter 8 for a more detailed 
discussion of these issues and procedures.) The purpose of the current chapter is to 
examine various problems that can be encountered in existing archaeological site 
survey data bases and to recommend ways of correcting some of the more apparent 
problems in order to obtain data sets better suited for regional analysis and 
modeling purposes. In other words, this chapter examines methods for reducing 
obvious biases so that the locational patterns apparent in a final analysis sample of 
existing archaeological data are more likely to be representative of overall locational 
patterns within the region of interest. 


PROBLEMS AND BIASES IN EXISTING SITE SURVEY DATA 


When examining existing archaeological site survey data bases from a region
one is often struck by the great variation apparent in the quality of the data. It has
been appropriately noted that the greatest source of variability in the archaeological
record may be due to the behavior of the archaeologist. This variation stems from a
number of factors, ranging from differing standards of quality or practice between
different archaeologists to changes through time in accepted field practices to
variability in the goals and research plans of individual survey projects.


The ways in which different field projects, archaeologists, and field crews
perform fieldwork and define, identify, and record archaeological sites introduce
the major sources of variation, bias, and inconsistencies in existing site data bases.
Chapter 4 describes some of the operational problems in defining sites on the basis
of diffuse scatters of artifacts. Sites defined by one project may not constitute sites
by another project's definition. Not only does the lack of standard archaeological
procedures, such as field methods and operational definitions of sites, create
inconsistencies in the data base, but differences in research goals from project to
project, even within a single region, create major inconsistencies in regional data
bases.


The problem goes deeper than this, however. Even within a single project,
sites might, in practice, be defined differently owing to differences in the quality of
individual field personnel and crews or because of other factors, such as insect
density, adverse weather, terrain roughness, crew tiredness, or the arrival of a
Friday afternoon. Budgetary constraints can also influence the quality of data
collected when, for example, a contractor has a fixed price contract but site densities
are greater than expected; this can lead to “hurrying” the survey. Schiffer and
Wells (1982:346) note that “this is accomplished by increasing crew spacing or
reducing the recording time. We suspect that such modifications in technique are
rather common, if seldom admitted in the final report.” These practices, of course,
can lower the quality of the resulting data.


Several factors influencing archaeological survey results and the quality of the
retrieved data are summarized by Schiffer and Wells (1982). A principal factor is
survey intensity or crew spacing. Crew spacing not only affects site discovery rates
but also the sizes of discovered sites (Plog et al. 1978). Small sites and cultural
features tend to be missed when crew spacing is large (Wandsnider and Ebert 1984).
Narrow spacing, however, dramatically increases survey time and effort and
therefore costs (Figure 7.1).


The nature or obtrusiveness of the archaeological evidence determines the
likelihood that a particular archaeological feature, such as a site or an artifact, will be
discovered given a specified level of survey intensity (Schiffer et al. 1978). A mound
or architectural feature, for example, has a higher chance of discovery than a single,
isolated flake. Low-intensity surveys (those with wide spacing) tend to bias
resulting archaeological samples in favor of more obtrusive remains (Schiffer and Wells
1982).


Difficulty of access, a common problem in many regions of the western United
States, might mean that samples are biased against difficult-to-reach regions. In
regions with relatively few access roads, for example, sampling units might be
placed with the restriction that units be within some maximum distance of an
existing road. Even when it is possible to arrive at hard-to-reach places, the limited
amount of time left in the day after travel might lower the quality of resulting
survey in those regions. Private land ownership presents similar difficulties when
landowners refuse access (Schiffer and Gumerman 1977:187). Indeed, in western
regions, where most archaeological survey work tends to be conducted on federal or
state lands, the lack of comparable site data from private properties presents a
severe source of bias to regional archaeological data bases, because private property
often includes some of the best agricultural lands as well as the best areas for
hunting and plant collecting, and prehistorically, it is these very places that often
were the most critical to site placement.


Variable archaeological visibility, due primarily to vegetation cover,
introduces another major source of potential bias. Planted fields, swamps, or forests
might offer poor visibility and low archaeological discovery rates, while desert
regions or sagebrush-grassland settings usually offer high visibility and excellent
site discovery rates (Schiffer and Gumerman 1977:187). Study regions containing
zones with markedly different levels of visibility are likely to have existing site data
bases biased toward the more visible zones.


Perhaps one of the principal weaknesses of existing data bases is that the sum
total of previous work in a given region constitutes an unplanned effort. In other
words, strong locational biases typically exist in the areas that have been field
inspected within a region. For example, early work often was conducted only at the
most accessible and visible sites, while much contemporary survey is conducted
primarily in areas of planned development. Thus, existing site data may be strongly
biased toward certain types of settings and may not constitute a representative
sample of sites within a region.

An additional problem is that sites might not be accurately located on maps.
For modeling approaches that focus on the specific locations of sites, accurate
placement of sites on maps is of critical importance since characteristics of the actual
locations, such as environmental properties, are often used as a basis for modeling.
In actual field practice it is often difficult to locate oneself precisely, particularly in
forested areas with few nearby landmarks. Field crews often get lost or misread
maps. Moreover, early archaeological surveys often did not have access to good
maps and offered only verbal descriptions, directions, and rough locational sketch
maps.

This problem is further compounded as site locations are transferred from map
to map. In examining existing site files for one Bureau of Land Management (BLM)
study, I found that the original site forms were available as well as the district's
master management maps. The latter are a set of maps that can be found in any
regional BLM office and contain the most up-to-date information on the locations of
all known sites and field-inspected regions. In this BLM district, the majority of the
sites were extremely small lithic scatters (essentially points on maps). When the site
forms, which included copies of original maps, were compared with the BLM's
master maps, many sites were found to have been mislocated when they were
copied from the original to the master maps (Figure 7.2). In fact, almost 10 percent
were mislocated by more than 100 m (one-sixth of an inch on 1:24,000 scale maps),
and several were even placed on the wrong drainage!


PROCEDURES FOR REDUCING DEFICIENCIES AND BIASES
IN EXISTING DATA


A number of problems with existing data bases were presented in the previous
section. In order for researchers to use such data in archaeological model
development they need to eliminate data of questionable quality and to reduce the effects of
apparent biases.


If possible, the original site forms should be obtained in order to assess the
quality of the initial site-recording effort and to eliminate secondary sources of error
that might be introduced by later handling of the data by other investigators (as in
the example discussed above). Certain minimal standards might be established;
precise location of the site on a USGS 7.5-minute map might be required, for
example, along with a description of some minimal amount of archaeological
evidence. Sites not meeting these standards might be eliminated at this stage.


When a pool of minimum-quality sites has been obtained based on inspection
of site forms, it would be prudent, depending on available funds, to examine in the
field a random sample of the sites recorded by each major investigator in the area.
This practice would allow verification of locational accuracy on maps as well as
assessments of site content and function. It would also be worthwhile to resurvey at
high intensity regions that have been field inspected by other researchers in order
to obtain data on site discovery rates. These rates might then be used as a means of
bias correction through the subsampling or weighted analysis techniques described
below.


When use of existing data in site-location model development is considered,
bias must be viewed in terms of current modeling goals. For example, a survey
conducted for the discovery of only Paleoindian sites is not relevant to a
site-location model for Puebloan villages. Similarly, a survey conducted in pine forests
does not bear on models for grassland settings.


The nature of bias also must be considered in terms of the type of modeling
approach used. Models that examine characteristics observed at the actual locations
of sites or models that use a small-size quadrat approach in which characteristics of
quadrats with sites are examined are particularly sensitive to the happenstance
locational biases of previous surveys. For example, if 60 percent of one part of a
study region has been field surveyed but only 20 percent of another part,
indiscriminate use of the site data without regard to these survey proportions can bias a
resulting model toward characteristics of the more extensively surveyed zones. On
the other hand, modeling approaches that partition a region into discrete
categories, such as environmental communities, and then project site densities in each
community are less sensitive to this factor. In this approach, if one community has
been 20 percent surveyed but another 60 percent, so much the better for the latter
community, since the resulting estimates of site density would presumably be more
reliable because they are based on more information.


Two major approaches might be investigated as a means of reducing the
influence of known biases in existing data. Subsampling attempts to reduce biases by
undersampling areas that have been extensively examined and by oversampling
areas that, by comparison, have been little examined. This procedure usually
requires that some information be disregarded during the model-building process,
but it should be realized that the sites eliminated during this part of the project
might be reserved to provide independent tests of locational models at some later
point. Weighted analysis, on the other hand, permits retention of all information, but
the impact of an individual case (e.g., a site) on the analyses can be weighted by, for
example, the importance of that case relative to other cases (see below).


Subsampling 


A common problem in existing site-file data bases is unequal survey coverage
in various regions of a study area; these inequalities are a result of the use of
nonprobabilistic designs and purposive survey that is commonly required for
various forms of cultural resource clearance. Early surveys typically examined only
the most ideal or most easily accessible regions, simply for purposes of site
discovery. Unequal survey coverage also occurs among different archaeologists or projects
owing to variation in crew spacing, vegetation cover, and other factors. A goal of
subsampling is to obtain a subset of the total number of sites available in the entire
study area such that many of the regional biasing factors are reduced in the final
subset. A number of approaches might be used to accomplish this goal.


One approach that helps to reduce the effects of unequal amounts of survey in 
different regions of a study area is to divide the area into discrete categories, such as 
environmental communities, and then to sample each category in a way that will 
correct for the inequities. A hypothetical study area containing three communities 
is portrayed in Figure 7.3a. Forty percent of community A has been field inspected, 
20 percent of community B, and 60 percent of community C. In developing a model 
for the entire study area it is important to remove the biasing effects of the more 
heavily surveyed communities. This might be accomplished by selecting 100 
percent of the sites in stratum B for the analysis sample and taking a simple random 
sample of 50 percent of the sites in stratum A and 33 percent of the sites in stratum 
C; this would yield a 20 percent overall sample of sites in the study area. 
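

The sampling fractions in this example follow from dividing the smallest survey coverage by each stratum's own coverage. A minimal Python sketch of that arithmetic is given here; the helper names `sampling_fractions` and `subsample_by_stratum` and the site identifiers are hypothetical and serve only as illustration.

```python
import random

def sampling_fractions(coverage):
    """Fraction of known sites to keep in each stratum so that every stratum
    ends up at the effective coverage of the least-surveyed stratum."""
    floor = min(coverage.values())
    return {stratum: floor / pct for stratum, pct in coverage.items()}

def subsample_by_stratum(sites_by_stratum, fractions, seed=0):
    """Simple random sample of sites within each stratum (hypothetical helper)."""
    rng = random.Random(seed)
    sample = {}
    for stratum, sites in sites_by_stratum.items():
        k = round(fractions[stratum] * len(sites))
        sample[stratum] = sorted(rng.sample(sites, k))
    return sample

# Percent of each community surveyed, as in Figure 7.3a.
coverage = {"A": 40, "B": 20, "C": 60}
fractions = sampling_fractions(coverage)  # A: 1/2, B: all sites, C: 1/3

# Made-up site identifiers, purely for illustration.
sites = {
    "A": [f"A{i}" for i in range(1, 21)],
    "B": [f"B{i}" for i in range(1, 11)],
    "C": [f"C{i}" for i in range(1, 31)],
}
analysis_sample = subsample_by_stratum(sites, fractions)
```

Sites left out of `analysis_sample` need not be discarded; they can be held back for later independent testing of the model.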


Another subsampling approach attempts to provide an analysis sample with a 
more uniform distribution of sites from within a study area. It is important to 
attempt to obtain a regional sample that is well distributed across the area of study 
in order to ensure that site location variation from throughout the entire region is
included in the sample. In this approach a grid may be superimposed over the study
area or over each stratum in the study area (Figure 7.3b). Depending on the size and
nature of the study area the grid might be as large as a township (6 by 6 mi) or as 
small as a hectare (100 by 100 m). The analysis sample for the gridded study area or 
gridded stratum is selected by choosing sites from within each grid unit, which 
creates a more uniformly distributed sample. For example, let us assume that the 
gridded region in Figure 7.3b is a portion of environmental community C in Figure 
7.3a. A simple random sample of 33 percent of all the sites in Figure 7.3b could, by 
chance, cause some of the gridded cells that contain sites to contribute no sites to 
the sample and others to contribute many. If a 33 percent simple random sample of 
the sites within each grid cell were taken instead, this would help to ensure a 
better-distributed analysis sample. 
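

The grid-based procedure can be sketched in Python as follows. The coordinates, the 1000 m cell size, and the function names are hypothetical. The rule of keeping at least one site from every occupied cell is an added assumption, introduced so that sparsely occupied cells are not dropped by rounding.

```python
import math
import random
from collections import defaultdict

def grid_cell(x, y, cell_size):
    """Index (column, row) of the grid cell containing point (x, y)."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))

def subsample_by_grid(sites, cell_size, fraction, seed=0):
    """Simple random sample of `fraction` of the sites in each occupied cell.
    Assumption: at least one site is kept from every occupied cell."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for site_id, x, y in sites:
        cells[grid_cell(x, y, cell_size)].append(site_id)
    sample = []
    for cell in sorted(cells):
        members = cells[cell]
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical site coordinates in meters: six sites in one cell,
# three in a second, and a single isolated site in a third.
sites = [
    ("s1", 100, 100), ("s2", 200, 300), ("s3", 400, 500),
    ("s4", 600, 700), ("s5", 800, 900), ("s6", 900, 100),
    ("s7", 1100, 200), ("s8", 1500, 400), ("s9", 1900, 800),
    ("s10", 2500, 2500),
]
sample = subsample_by_grid(sites, cell_size=1000, fraction=0.33)
```

Every occupied cell contributes to `sample`, which is the property a single region-wide random draw cannot guarantee.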


A third subsampling approach may be used when large clusters of sites exist in
a data base. Clusters of sites can have adverse effects on later analyses because the
clustered sites may have highly related characteristics rather than offering new and
independent information. A field-inspected region containing a single cluster of
many sites along with a number of dispersed sites is portrayed in Figure 7.3c. If a
subsample of 20 percent of all sites in the region were randomly selected for an
analysis sample, it is likely that all or almost all of the selected sites might be from
the single cluster. Yet, multiple sites from the same cluster might yield much
redundant locational information, and it might be desirable to incorporate the
locational variation of sites outside the cluster into the sample when the goal is a
regionwide model and most of the region of concern is outside the cluster. This can
be accomplished by stratifying the area into a cluster region and a noncluster region
(Figure 7.3c) and taking a simple random sample of 20 percent of the sites in each
region.


Figure 7.3. Illustrations for bias correction procedures. (A) Three environmental communities with unequal amounts of survey coverage: 40 percent of
community A has been surveyed, 20 percent of community B, and 60 percent of community C. (B) A grid superimposed on a region to allow better-distributed
samples by selecting sites from each grid cell (dots represent sites). (C) A surveyed region (dark area) containing a cluster of many sites. The small rectangle
represents a “cluster” stratum.

It might even be desirable, under certain circumstances, to reduce the influ- 
ence of major clusters still further. This could be accomplished, for example, by 
taking a larger sample of sites outside denoted clusters (e.g., 30 percent) and a
smaller sample of sites within clusters (e.g., 15 percent). The goal might be to
develop a model that performs well for the portion of a study region that lies outside 
clusters. This would be particularly useful where previous investigation has shown 
that sites from major clusters tend to possess locational properties different from 
those of sites outside clusters. By taking a smaller sample of sites from clusters, one 
can reduce the influence of those sites in an analysis. On the other hand, the very 
presence of clustering can be indicative of desirable locations that need to be 
included in a sample. Hence, some thought should be given to the goals of the 
analysis and to the behavioral implications of such patterns when one is using 
clustered data. The presence of significant clustering can be determined through 
simple statistical tests described by Clark and Evans (1954), Dacey (1973), and
Thomas (1971:41-43). 
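

As one illustration of such a test, the nearest-neighbor statistic of Clark and Evans (1954) compares the observed mean nearest-neighbor distance with the distance expected for a random pattern of the same density. The Python sketch below is a bare-bones version under stated assumptions: it applies no edge correction, it treats the study area as a simple known quantity, and the point coordinates are invented for the example.

```python
import math

def clark_evans(points, area):
    """Return (R, z) for the Clark and Evans nearest-neighbor test.
    R < 1 suggests clustering, R near 1 randomness, R > 1 dispersion.
    No edge correction is applied (a simplifying assumption)."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n
    density = n / area
    expected = 1.0 / (2.0 * math.sqrt(density))
    std_error = 0.26136 / math.sqrt(n * density)
    return observed / expected, (observed - expected) / std_error

# Hypothetical example: 25 sites packed 10 m apart inside a
# 1000 m by 1000 m surveyed block.
clustered = [(500 + 10 * i, 500 + 10 * j) for i in range(5) for j in range(5)]
ratio, z_score = clark_evans(clustered, area=1_000_000.0)
```

A ratio well below 1 with a strongly negative `z_score`, as in this contrived case, would point to clustering that the analyst may wish to treat as a separate stratum.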


Weighted Analysis 


Weighted analyses can present an alternative to the elimination of data when 
existing site information is used for model development. Individual cases or sites can 
be assigned a weight that affects the influence of that site in subsequent analyses. 
Sites with more “important” location information (e.g., those that lie in undersur- 
veyed regions) can be assigned more weight, and sites with less important location 
information (e.g., from well-surveyed regions or from major site clusters) can be 
assigned less weight. In this manner it is possible to utilize information from all or 
most of the sites, while correcting for certain biases at the same time. 


Common statistical analysis computer programs, such as the Statistical Pack- 
age for the Social Sciences (SPSS 1983), the Statistical Analysis System (SAS Institute 
1985), and BMDP Statistical Software (Dixon et al. 1983), allow case weighting as an 
option for many procedures. A weighted sample mean is given by

$$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

and a weighted variance by

$$s_w^2 = \frac{\sum_{i=1}^{n} w_i (x_i - \bar{x}_w)^2}{\sum_{i=1}^{n} w_i - 1}$$

where $x_i$ is the sample value for the $i$th case (site), and $w_i$ is the weight associated
with that case. Note that if $w_i = 1$ for all $n$ cases, these equations reduce to the
traditional formulas for mean and variance.


To illustrate how these formulas might be applied, the first problem area of the 
previous section, environmental communities with disproportionate areas of survey 
(Figure 7.3a), will be examined. The hypothetical region contains three communi- 
ties, A, B, and C, of which 40, 20, and 60 percent, respectively, have been surveyed. 
Suppose, for simplicity, that 4 sites were found in zone A, 3 in zone B, and 6 in zone 
C, for a total of 13 known sites (Table 7.1). The subsampling approach described in 
the previous section called for selecting half of the zone A sites and a third of the 
zone C sites, which would provide an approximate overall 20 percent sample
consisting of only seven sites. The weighting approach merely assigns weights to all 
of the cases such that a site’s contribution is inversely proportional to the percen- 
tage of area that has been surveyed (Table 7.1). Thus, a site in zone B (of which only 
20 percent has been surveyed) carries twice as much weight as a zone A site (of 
which 40 percent has been surveyed) and three times as much weight as a zone C 
site (of which 60 percent has been surveyed). 
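

The weights in Table 7.1 can be reproduced with a short Python sketch. Each raw weight is taken as the reciprocal of the percentage surveyed; the rescaling so that the weights sum to the number of sites is inferred from the tabled values (which sum to 13) rather than stated in the text, and the function name `coverage_weights` is hypothetical.

```python
def coverage_weights(strata, coverage):
    """Weights inversely proportional to percent surveyed, rescaled so
    that they sum to the number of cases (inferred from Table 7.1)."""
    raw = [1.0 / coverage[s] for s in strata]
    scale = len(strata) / sum(raw)
    return [r * scale for r in raw]

# Stratum of each of the 13 sites and percent of each stratum surveyed.
strata = ["A"] * 4 + ["B"] * 3 + ["C"] * 6
coverage = {"A": 40, "B": 20, "C": 60}

weights = coverage_weights(strata, coverage)
for stratum in ("A", "B", "C"):
    w = weights[strata.index(stratum)]
    print(f"zone {stratum}: weight {w:.4f}")
```

The zone B weight comes out at twice the zone A weight and three times the zone C weight, matching the ratios described above.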


In conducting a site-location analysis encompassing multiple regions, as in
Table 7.1 and Figure 7.3a, weighting can permit the archaeologist to emphasize
features peculiar to undersurveyed regions. For example, let us say that zone B
contains more variable terrain in the form of hills and ridges and fewer sources of
water than zones A and C. Hypothetical measurements of slope and distance to
nearest water given in Table 7.1 show that, without weighting (i.e., $w_i$ = 1), the
measurements from the sites in the more heavily surveyed zones A and C dominate,
yielding a mean slope of only 5.23 and a mean distance to nearest water of 115.38.
When weights giving increased influence to the zone B data are used, however, the
weighted mean values exhibit greater slopes (a mean of 7.81) and distances to water
(a mean of 153.57), reflecting the greater steepness of hillslopes and the paucity of
water in zone B.


TABLE 7.1.
Example of weights applied to data as a means of bias correction

Site   Slope   Distance to Water   Stratum   Percent Surveyed   Weight ($w_i$)
  1       2      [illegible]           A             40              .9286
  2       6       70                   A             40              .9286
  3       3       20                   A             40              .9286
  4       1      [illegible]           A             40              .9286
  5      12      180                   B             20             1.8571
  6      18      [illegible]           B             20             1.8571
  7      15      [illegible]           B             20             1.8571
  8       0      [illegible]           C             60              .6190
  9       2      100                   C             60              .6190
 10       1      [illegible]           C             60              .6190
 11       4      [illegible]           C             60              .6190
 12       3      120                   C             60              .6190
 13       1      [illegible]           C             60              .6190

Normal weighting ($w_i$ = 1)
                Slope      Distance to Water
Mean             5.23          115.38
Variance        34.8590     10,110.257

Stratum weighting (using equations given in text)
                Slope      Distance to Water
Mean             7.81          153.57
Variance        25.2627     14,877.342
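

The slope means quoted above can be checked with a few lines of Python. This is a sketch rather than the procedure used by any particular statistical package: the slope values are those listed for the 13 sites in Table 7.1, the weights are rebuilt from the percentages surveyed, and the variance helper assumes a denominator of the summed weights minus one, a convention chosen here so that unit weights give the ordinary sample variance. The variance helper is applied only to the unweighted case.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def weighted_variance(values, weights):
    """Weighted variance; assumed denominator is (sum of weights - 1)."""
    mean = weighted_mean(values, weights)
    squares = sum(w * (x - mean) ** 2 for x, w in zip(values, weights))
    return squares / (sum(weights) - 1.0)

# Slope for sites 1-13 (zones A, B, C) as listed in Table 7.1.
slope = [2, 6, 3, 1, 12, 18, 15, 0, 2, 1, 4, 3, 1]
percent_surveyed = [40] * 4 + [20] * 3 + [60] * 6

# Weights inversely proportional to coverage, rescaled to sum to 13.
raw = [1.0 / p for p in percent_surveyed]
weights = [r * len(raw) / sum(raw) for r in raw]
unit = [1.0] * len(slope)

plain_mean = weighted_mean(slope, unit)         # about 5.23
stratum_mean = weighted_mean(slope, weights)    # about 7.81
plain_variance = weighted_variance(slope, unit)
```

The shift from roughly 5.23 to 7.81 comes entirely from the three zone B sites, whose steep slopes count for more once their weights are raised.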


The utility of case weighting is not restricted to altering regional survey 
coverage bias; this procedure can be applied to other sources of bias as well. If 
reliable estimates can be made of site discovery rates under different types of 
vegetation cover, the discovered sites in zones offering less visibility might be given 
greater weight in analysis. A similar approach could potentially be applied to correct 
for differences in site discovery rates between different archaeologists or projects.
The use of weighting to correct any of these forms of bias should be carried out only 
after thorough consideration of the available evidence, however. 


Finally, it is important to note that not only can weighted means and variances 
be computed, but also covariances, which open the doors to the host of multivariate 
procedures discussed in Chapters 5 and 8. 


EVALUATION OF SITE-LOCATION PATTERNING AND 
MODEL BUILDING WITH EXISTING DATA 


When this stage is reached it must be assumed that the researcher believes he 
or she has a reasonably good sample of existing sites with which to work. The data 
might exist in several groups, each corresponding to a different site type. The 
investigator must decide on the kinds of phenomena that should be investigated for 
possible relationships with the locations of sites and then devise ways to make these 
phenomena operational. In other words, the variables that are to be investigated 
must be defined. An overview of some of the variables commonly used in site- 
location research and of the ways in which they can be made operational is given in 
Chapter 8. Once the variables are defined, they must be measured or observed on 
maps at each of the sample site locations, either by hand (Chapter 8), through
remote-sensing techniques (Chapter 9), or through computer technology using
geographic information systems (Chapter 10).

A usual step in the model-building process (e.g., Larralde and Chandler 1981;
Thomas and Bettinger 1976) is to examine the data at this point through use of
histograms, descriptive statistics, or simple univariate statistical procedures. In this
way it is possible to identify variables that are more likely and less likely to have
some bearing on the locations of sites in general or of individual site types.


The empirical data can then be subjected to a variety of modeling approaches
ranging from simple mathematical rules to multivariate statistical techniques. A
single-class classifier approach (Lin and Minter 1976; Thomas and Bettinger 1976) 
can be used to model the distribution of individual site classes, or a control-group 
approach consisting of background environment measurements at locations where 
sites are absent might be used to contrast locations where there are no sites with the 
locations of known sites using a variety of quantitative classification techniques. A 
wide range of approaches using a variety of techniques is illustrated in Chapter 8. As 
noted in that chapter, any form of decision rule may properly be used to develop a
modeling procedure for classifying locations (for example, as site-likely,
site-type-likely, or site-unlikely locations). Admittedly, some procedures work better than
others, and statistical procedures generally work best when the required assump- 
tions are fully met. Once a modeling procedure is developed, however, its perfor- 
mance must be assessed using statistical theory, an independent sample of data (of 
the kind of site being investigated and from the region being modeled), and a 
sample that can be argued to be representative of the sites in the region. 


Assessing a Model and Determining Additional Data Needs 


A fundamental question that must be asked when evaluating a model based on 
existing data is whether or not the model might be biased. Even if a developed 
model successfully predicts locational patterns similar to patterns exhibited in the 
existing site data base, how certain can we be that the existing site data patterns are 
representative of the locational patterns of as-yet-undiscovered sites in unsurveyed 
regions? Despite careful data evaluation and crude attempts at bias removal, it is
possible that the bulk of existing sites really are not representative of sites in the 
general study area, and there is no way to determine whether or not this is the case 
unless some form of data known to be representative of sites in the region at large 
are obtained with which to test the model. 


An initial and simple test of model performance may be obtained simply by 
applying the model to the same data used to build the model. Although at best this 
procedure yields an inflated view of the model's true performance, it can provide an 
immediate indication of model deficiencies. The predictions of site locations made 
by the model might be categorized along several dimensions to assess performance 
in a number of areas (Table 7.2; see Chapter 8 for a discussion of the necessity for 
reviewing model predictions of site-absent locations or nonsites as well). For 
example, the model might be examined to see how well it predicts various func- 
tional or temporal site types or various subtypes of sites (the columns in Table 7.2). 
Similarly, the performance of the model relative to different environmental
settings, such as various plant communities or topographical situations, might be
assessed (the rows in Table 7.2). Deficiencies at this stage should be taken seriously; 
if they exist here they certainly will exist when the model is applied to independent 
and new samples.


TABLE 7.2.

Assessing model performance along several site-type and environmental categories. In testing a
site-location model, the percentage of correct model predictions for each site type is assessed
along the columns and the percentage of correct model predictions for each environmental
category is assessed along the rows.

                              Site Type 1   Site Type 2   Site Type 3   Percent Correct
                                                                        Environmental Predictions
Environmental Category A
Environmental Category B
Environmental Category C
Percent Correct
Site Predictions


Model tests that are more independent and can yield a truer picture of actual
model performance may also be performed using existing data. One independent
test uses sites in the existing data base that were not used to construct the
model (entries that were eliminated during attempts at bias removal, for example).
Such sites represent independent information, and the model can be applied as
shown in Table 7.2. A somewhat better approach is called split sampling (Mosteller
and Tukey 1977:38); with this procedure the analysis data are randomly split into two
groups, a model is built with one half, and the remaining data are used to provide an
independent test of the model. The jackknife procedure (Mosteller and Tukey
1977:133) presents yet another alternative. In this procedure one case in the analysis
data set is temporarily “thrown out,” the remaining data are used to build a model,
and the single case is used to test the model. This process is repeated using each case
in turn to yield an independent assessment of model performance. (Split sampling
and jackknifing are discussed in more detail in Chapter 8.)
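

The three testing schemes just described (reapplying the model to its own data, split sampling, and the jackknife) can be contrasted in a small Python sketch. Everything specific here is an assumption made for illustration: the "model" is a toy threshold on distance to water, the distances are invented, and the split is made within the site and nonsite groups separately so that both halves contain both kinds of location.

```python
import random

def fit_threshold(cases):
    """Toy model: midpoint between the mean values of sites and nonsites."""
    site_values = [x for x, is_site in cases if is_site]
    nonsite_values = [x for x, is_site in cases if not is_site]
    site_mean = sum(site_values) / len(site_values)
    nonsite_mean = sum(nonsite_values) / len(nonsite_values)
    return (site_mean + nonsite_mean) / 2.0

def accuracy(threshold, cases):
    """Share of cases classified correctly; 'site' is predicted below the threshold."""
    hits = sum((x < threshold) == is_site for x, is_site in cases)
    return hits / len(cases)

def split_sample_accuracy(cases, seed=0):
    """Build the model on a random half of each group, test on the other half."""
    rng = random.Random(seed)
    build, test = [], []
    for label in (True, False):
        group = [c for c in cases if c[1] == label]
        rng.shuffle(group)
        half = len(group) // 2
        build.extend(group[:half])
        test.extend(group[half:])
    return accuracy(fit_threshold(build), test)

def jackknife_accuracy(cases):
    """Leave each case out in turn, rebuild the model, and test on that case."""
    hits = 0
    for i, (x, is_site) in enumerate(cases):
        remaining = cases[:i] + cases[i + 1:]
        hits += (x < fit_threshold(remaining)) == is_site
    return hits / len(cases)

# Invented distances to water (m) for known sites and for nonsite locations.
site_distances = [50, 80, 120, 60, 200, 90, 110, 70, 150, 40]
nonsite_distances = [300, 250, 400, 180, 350, 500, 220, 280, 160, 320]
cases = ([(d, True) for d in site_distances]
         + [(d, False) for d in nonsite_distances])

resubstitution = accuracy(fit_threshold(cases), cases)
split_result = split_sample_accuracy(cases)
jackknife_result = jackknife_accuracy(cases)
```

With a real model the same three calls would wrap whatever fitting and prediction routines are in use; only `fit_threshold` and the comparison inside `accuracy` would change.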


Since existing data often are highly clustered (Figure 7.3c), the traditional
split-sampling and jackknife approaches still might yield an inflated picture of
model performance. A site that is part of a cluster of sites might exhibit
characteristics that are highly related to the other sites in the cluster. When that site is used in
a split-sample or jackknife procedure, it does not necessarily yield an independent
test since its characteristics are related to those of other sites, some of which may
have been used to develop the model. An alternative that might offer less inflated
results is to superimpose a large grid, like that shown in Figure 7.3b, over the region
and to use the grid cells as the basis for the split-sample or jackknife techniques. For
split sampling the individual cells are split at random into two groups, and the
analysis proceeds with sites in the selected half of the grid cells while sites in the
remaining half are reserved for model testing. In the jackknife approach the sites in
the ith grid cell are eliminated from the ith model and are then used to test that
model independently, with this process repeated for every cell.
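

A grid-based version of these tests only changes what is held out: whole cells rather than individual sites. The Python sketch below shows the bookkeeping, leaving model fitting aside. The case records, coordinates, 1000 m cell size, and function names are all hypothetical.

```python
import random
from collections import defaultdict

def group_by_cell(cases, cell_size):
    """Group case records by the grid cell that contains them."""
    cells = defaultdict(list)
    for case in cases:
        key = (int(case["x"] // cell_size), int(case["y"] // cell_size))
        cells[key].append(case)
    return cells

def split_cells(cells, seed=0):
    """Randomly assign half of the cells to model building, the rest to testing."""
    rng = random.Random(seed)
    keys = sorted(cells)
    rng.shuffle(keys)
    half = len(keys) // 2
    build = [c for k in keys[:half] for c in cells[k]]
    test = [c for k in keys[half:] for c in cells[k]]
    return build, test

def leave_one_cell_out(cells):
    """Yield (held-out cell, model-building cases, test cases) for every cell."""
    for held_out in sorted(cells):
        build = [c for k in sorted(cells) if k != held_out for c in cells[k]]
        yield held_out, build, cells[held_out]

# Invented site records with coordinates in meters.
cases = [
    {"id": 1, "x": 100, "y": 200}, {"id": 2, "x": 150, "y": 250},
    {"id": 3, "x": 1200, "y": 300}, {"id": 4, "x": 1300, "y": 350},
    {"id": 5, "x": 400, "y": 1600}, {"id": 6, "x": 2500, "y": 2100},
    {"id": 7, "x": 2600, "y": 2200}, {"id": 8, "x": 2700, "y": 2300},
]
cells = group_by_cell(cases, cell_size=1000)
build_cases, test_cases = split_cells(cells)
folds = list(leave_one_cell_out(cells))
```

Because neighboring sites fall in the same cell, a cluster is either wholly in the model-building set or wholly in the test set, which is the point of the grid-based variant.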


Such testing procedures, however, are only as good as the data to which they 
are applied, and as mentioned earlier, existing data might inherently be strongly
biased. Independent and representative data are therefore needed if we are to assess 
model performance in a reliable and confident manner. 


In many federally administered regions and districts some form of random
sampling survey may well have been conducted in the past. These data can be used
for model testing if it can be argued that they are representative of sites (or the site
type of interest) in the whole region and if the site sample was suitably constructed
and sufficiently large. Not only can these data be used to assess accuracy (Table 7.2),
but statistical significance can also be determined and confidence limits around the
predictions can be calculated. Since the width of a confidence interval is directly a
function of sample size, relatively large test samples are desirable. For example, if a
model accuracy rate of 80 percent correct is obtained, a sample size of 50 yields a 95
percent confidence interval width of 19.8 percent (±9.9 percent), a sample size of
100 yields an interval width of 15.5 percent (±7.8 percent), and a sample size of 200
yields an interval width of 11.0 percent (±5.5 percent; Hord and Brooner 1976).
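

The dependence of interval width on sample size can be illustrated with the familiar normal approximation to the binomial. This Python sketch is an assumption about method, not a reproduction of the Hord and Brooner (1976) calculations: it gives half-widths of about 7.8 and 5.5 percentage points for samples of 100 and 200, in line with the figures quoted, while for a sample of 50 it gives roughly 11.1 points, somewhat wider than the quoted value, which presumably rests on a different small-sample procedure.

```python
import math

def interval_half_width(accuracy, n, z=1.96):
    """Half-width of an approximate 95 percent confidence interval for a
    proportion, using the normal approximation (an assumed method)."""
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n)

for n in (50, 100, 200):
    half = interval_half_width(0.80, n)
    print(f"n = {n:3d}: 80 percent correct, plus or minus {100 * half:.1f} points")
```

Halving the interval width requires roughly four times as many test cases, which is why relatively large independent test samples are worth the effort.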


Collecting and Integrating New Data in Model Development 


During model testing through use of existing data or through use of independ- 
ent test data it might be discovered that a model underperforms for certain types of 
sites. Alternatively, a model might perform poorly when applied to certain environmental
settings (grassland settings, for example; Table 7.2).


In order to attempt to remedy these failings, the researcher might go back to 
the existing data base (especially if it contains a number of sites eliminated from the 
analysis through subsampling or for other reasons) and, using the above examples, 
attempt to incorporate more grassland sites or more sites of the type being 
underpredicted. If a weighted analysis approach is being used, the investigator
might simply assign more weight to site types or sites in environmental settings
that are being poorly modeled. The model-building and model-assessment stages 
then might be repeated. 


Another approach to remedying modeling problems is to develop a specific
model for the particular environmental setting or site type that is being incorrectly
predicted (Stone 1984). This tactic might be more successful than refinement of the
original model, since a site-type or environmentally specific model would only focus
on the locational variation exhibited by the particular setting or site type. It should
be noted, however, that when analyses become too fine-grained, as when specific
site types or environmental communities are investigated, available sample sizes
can become prohibitively small.


A last alternative when one is faced with the problems of under- or overprediction
by a site-location model is to conduct a new survey designed to obtain more
data from deficiently predicted environmental regions or site types. This is a last
resort, due to costs, and should be performed only when the researcher is certain
that the modeling application warrants collection of new data. It might be that it is
not possible to model the locations of sites in a specific environmental community
successfully (owing to a low level of patterning with respect to the variables
examined, for example) regardless of the amount of data available. The collection of
new data in this case would not offer any improvement to the modeling situation.
Before initiating a new survey the investigator should consider this possibility by
examining the quality and amount of the existing data.


When implementing a survey for the purpose of providing more information 
about a particular region, such as a specific environmental community, some form of 
random sampling design should be used. Sites discovered by this survey could then 
be compared with previously known sites in the same community. This comparison
can entail visual inspection of the shapes of histograms of the measured variables,
descriptive statistics, indices of difference, and statistical tests for differences, such
as the t-test. If differences between the samples are found, this would suggest that
new and different information might be contained in the new sample. The new data
might then be incorporated into the analysis data base or analyzed as a separate data
base, and the model-building and testing processes could be reinitiated.
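
A comparison of a newly surveyed sample with previously known sites on a single
variable might be sketched as follows. Two things are assumptions here: the slope
values are invented for illustration, and the sketch uses Welch's unequal-variance
form of the t statistic rather than the pooled-variance form, because the two
samples need not share a common variance. The function name welch_t is hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)      # squared standard error of mean(a)
    vb = variance(b) / len(b)      # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical slope measurements (percent grade) at site locations.
known_sites = [2.0, 3.5, 4.0, 1.5, 3.0, 2.5]
new_sites = [6.0, 7.5, 5.0, 8.0, 6.5, 7.0]

t, df = welch_t(new_sites, known_sites)
print(round(t, 2), round(df, 1))
```

A large absolute value of t relative to the tabled critical value for the computed
degrees of freedom would suggest that the new sample carries different locational
information than the known sites.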


New site data inevitably become available as archaeological work continues in
a region. Model updating and testing using these new data can be performed as an
ongoing process. The techniques used in evaluating existing data should also be
applied to these new data; i.e., the quality of site recording and survey should be
investigated, and appropriate bias-removing techniques, such as subsampling to
reduce locational survey bias, should be employed.


EXAMPLE ANALYSIS 


A settlement pattern study of Mesolithic sites in the Federal Republic of
Germany (Kvamme and Jochim 1988; also see Kvamme 1986) will be described here
as an example of the use of existing site data as a basis for locational modeling.
Although this study does not illustrate many of the bias-reduction techniques
discussed above, it does illustrate the locational patterning that can be found, and
the kinds of interpretations of results that can be made, given the biases that might
exist in a body of regional archaeological data. This study focused on a region near
Stuttgart where there are many recorded Mesolithic sites. The journal Fundberichte
aus Schwaben, which contains regional archaeological reports of investigations by
local amateurs, was used to obtain the locations of 170 known Mesolithic sites in the
region. Since the site descriptions were very terse it was not possible to assess
quality of reporting, nor was it possible to field check any of the sites. The sites did,
however, appear to offer a fairly good spatial distribution that was well spread
throughout the 940 km² study area (Figure 7.4a).


Previous research in the Mesolithic of northern Europe had suggested a
number of relationships between the physical environment and patterns of settlement.
Nine environmental variables were selected for this study (Kvamme and
Jochim 1988), largely on the basis of previous work. These variables are elevation,
slope, aspect, local relief, a measure of view quality, a measure of shelter potential,
horizontal distance to nearest water, vertical distance to water, and horizontal
distance to nearest third-order stream (see Chapter 8 for a discussion of how these
variables can be defined). Measurements of each variable were made at the locations
of the 170 known Mesolithic sites, and the same measurements were made at 100 m
intervals across the entire background environment (a total of 84,000 measurements
for each variable). The large number of measurements was possible owing to the use
of computer-based geographic information system (GIS) techniques (see Chapter
10 for discussion of how the computer approximates measurements on the basis of a
regular grid system).

The methodological premise of the study was that, in order to determine
significant environmental patterning at site locations, one must contrast empirical
data measured at known sites with the same data measured in the background
environment. For example, if only the site locations were examined, as is usually the
case, the data might indicate a major tendency for south-facing aspects. Such a
tendency in the data could reflect a significant pattern, or conversely, the entire
study region might generally possess a south-facing orientation, in which case the
pattern exhibited by the sites would only be a reflection of the background
environment; it is an examination of the background data that allows us to make
this assessment. For each variable, the data measured at the 170 sites were contrasted
with a representative sample of 3201 measurements taken from the background
environment using Student's t-statistics as a rough guide for differences
between the two groups. Since the a priori chance of an as-yet-undiscovered
Mesolithic site occurring in one of the background samples was assumed to be
extremely low, the two classes could be argued to be reasonably distinct, although
the representativeness of the Mesolithic sample and the general independence
problem of spatial samples forced cautious interpretation of the statistical results.
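
The south-facing example can be made concrete with a small Python sketch. All of
the numbers are invented; the 135 to 225 degree definition of "south-facing" is an
assumed cutoff; and a two-proportion z statistic is swapped in here as a rough
guide, in place of the t-statistics used in the study, because the quantity being
compared is a share rather than a mean.

```python
import math

def is_south_facing(aspect_deg):
    """Treat aspects between 135 and 225 degrees as south-facing (assumed cutoff)."""
    return 135.0 <= aspect_deg <= 225.0

def share_south(aspects):
    """Fraction of locations whose aspect is south-facing."""
    return sum(is_south_facing(a) for a in aspects) / len(aspects)

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two proportions (pooled variance)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se

# Hypothetical aspect measurements in degrees.
site_aspects = [180, 170, 200, 190, 160, 90, 210, 175]
background_aspects = [180, 175, 195, 200, 165, 100, 205, 170, 185, 150, 220, 80]

p_sites = share_south(site_aspects)
p_background = share_south(background_aspects)
z = two_proportion_z(p_sites, len(site_aspects),
                     p_background, len(background_aspects))
```

In this invented case most sites face south, but so does most of the background,
and the small z value indicates that the site pattern merely mirrors the region.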


The analysis results (Table 7.3) indicate a number of strong patterns of
contrasts between site locations and the background environment (in the original
study, detailed histograms were also examined). The sites show a strong tendency
toward level ground slope (Figure 7.4a), for regions of great relief, and for higher
elevations, suggesting high-elevation ridge crests and the edges of plateau tops as
the primary locus of site placement in the region. Although there was no strong
preference for orientation of aspect, the remaining variables were supportive of the
suggested pattern. The sites possessed wider views and lower values for shelter
(reflected by a higher index in Table 7.3) than the background environment, which
is consistent with these high-point locations. Moreover, the results showed fairly
strong tendencies for site location relatively far from water, also pointing to ridges
and plateau edges, which tend to be located far from water.


A multivariate model of the Mesolithic site-locational pattern was developed
during this study, not for prediction purposes but in order to assess the locational
pattern in the known site sample further. A robust nonparametric discriminant
function known as logistic regression (see Chapters 5 and 8) was used to develop the
model, which supported the univariate findings. The model, in conjunction with
the GIS, was used to map the quantitative environmental pattern of site location
over the remainder of the study region (i.e., every 100 m) in order to provide a visual
representation that summarizes the Mesolithic tendency (Figure 7.4b; see Chapter
10 for a more detailed discussion of how this is accomplished). The mapped pattern
also supported the univariate findings of a tendency for sites to be located on ridge
tops and the edges of plateau tops, considerable distances from drainages (compare
Figures 7.4a and 7.4b).
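
The mapping step can be pictured as evaluating a fitted logistic function at every
grid cell. The Python sketch below is only an illustration: the two variables, the
coefficient values, and the cell records are all hypothetical and are not taken
from the Mesolithic study, which used nine variables and its own fitted weights.

```python
import math

# Hypothetical coefficients; a real model would estimate these from site data.
COEF = {"intercept": -1.0, "slope": -0.2, "dist_water": 0.004}

def site_probability(slope_pct, dist_water_m, coef=COEF):
    """Logistic function of a linear score built from two environmental variables."""
    score = (coef["intercept"]
             + coef["slope"] * slope_pct
             + coef["dist_water"] * dist_water_m)
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical 100 m grid cells: (cell id, slope in percent, distance to water in m).
cells = [("c1", 2.0, 600.0), ("c2", 20.0, 50.0), ("c3", 5.0, 300.0)]

# One value per cell; plotted on the grid, these values form the mapped surface.
surface = {cid: site_probability(slope, dist) for cid, slope, dist in cells}
```

With these assumed signs, level cells far from water receive the highest values,
which is the direction of the pattern reported for the Mesolithic sample.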


A number of cautious interpretations can be drawn from these empirical data
(Kvamme and Jochim 1988). Patterns in this nonrandom sample of sites might reflect
Mesolithic locational preferences, modern collector biases, geological or other
processes, or a combination of these factors. Geological processes might have
introduced bias to the sample in a number of ways. Although the general patterns of
landform and drainage in the study area have not changed since the Mesolithic,
alluvial deposition has occurred. If there are deeply buried sites in these areas of
deposition, the sample will be biased away from locations in valley floors. Erosion,
on the other hand, might have destroyed sites on steep slopes or along streams
where meandering has occurred, thus biasing the sample away from steep slopes
and drainage locations. Another factor influencing site visibility is modern land use.
Materials in plowed fields tend to have higher visibility than those in forested areas,
which biases the sample toward areas under cultivation, such as river terraces,
gentle slopes, and ridge and plateau tops.

Geologic processes and modern land-use patterns have biased the efforts of
modern collectors away from steep slopes and marshy valley bottoms and toward
areas under cultivation on river terraces, gentle slopes, and ridge and plateau tops,
and this is indeed a pattern similar to that demonstrated by the site sample (Figure
7.4b). The sites, however, exhibit a more restricted pattern in that they tend not to
occur on river terraces or hill flanks, and they are found mainly on the edges of
plateaus rather than on all portions of plateaus. Because the site distribution is more
restricted than the pattern of areas inspected by the amateur collectors who
reported the sites, Jochim and I have suggested that the observed distribution of
sites appears to be partially the result of Mesolithic locational preferences (Kvamme
and Jochim 1988).

Interpretations of these patterns should also take into account the nature of
the archaeological sample. The sample used in this study included all Mesolithic
sites recorded in the region regardless of function or season of occupation (factors
that were unknown). Different site types could, of course, have varied locational
requirements. As has been noted,


The locational pattern of such a mixed group of sites is difficult to interpret. In part it
represents a blending of characteristics specific to each site type and season, weighted
according to their proportional representation in the sample. Since the site types and
their proportions are not currently known, it is not possible to separate those different
specific patterns. In the study, for example, sites showed no tendency to face any
direction. It may be, however, that winter residential camps showed a tendency to face
south, while sites of other seasons and functions had other characteristic orientations.
The mixed sample would obscure these separate patterns [Kvamme and Jochim 1988].


Based on the results of our research, however, we concluded that the overall
pattern reflects environmental characteristics common to all sites and all sets of
activities and that interpretation should emphasize general advantages of such
locations rather than those relevant only to certain seasons or specific activities. In
the region of study these advantages may have included (a) wide views allowing
easy spotting of game and strangers in any season; (b) strong breezes providing
comfort in summer, reducing snow cover in winter, and helping to keep away
insects; (c) good drainage in every season; and (d) light forests adapted to these
exposed, dry situations, which may have offered ease of travel, hunting, and
burning. Large distances from water may reflect an avoidance of riverine forests, the
unimportance of riverine resources, or a major importance of high elevations. The
tendency for level ground probably represents the preference for performing
activities on level ground (Kvamme and Jochim 1988).


In terms of the present volume, the multivariate model of the Mesolithic site 
pattern and its mapping (Figure 7.4b) can be viewed as a “predictive model” for 
Mesolithic sites based on existing data. The model remains untested, however, and
its performance as a predictive tool cannot be evaluated until the model is applied to 
a sufficiently large, independent, and representative sample of Mesolithic sites from 
within the study region. At this point there is simply no way to determine whether 
the known site sample upon which the model is based is strongly biased (e.g., as a 
result of the unsystematic way that amateurs find sites or of geological processes), or 
indeed whether it is representative of the region’s Mesolithic pattern in general. 
Before the adequacy of the model could be assessed, some form of random sample 
survey would have to be conducted within the region, and a sufficiently large 
sample of Mesolithic sites would have to be discovered. The multivariate model of 
site location could then be applied to this new and representative sample, and the 
percentage of correctly predicted sites could be determined, along with statistical 
confidence limits around the prediction. 


Chapter 8


DEVELOPMENT AND TESTING OF QUANTITATIVE MODELS 


Kenneth L. Kvamme 


This chapter is about the application of methods of empirical analysis
(mathematics, statistics, and computer-processing techniques) to the development
and testing of models of archaeological distributions that have a predictive capacity.
This chapter is written primarily for the archaeologist with a background in
quantitative methods of data analysis who is contemplating the development and
testing of archaeological locational models. In order to appeal to a broader base of
readers, the number of mathematical equations has been kept to a minimum,
extensive descriptions of the various methods have been provided, and figures have
been used to illustrate the techniques whenever possible.


Past peoples left behind material evidence of their actions: the archaeological
record. This record is full of telltale patterns. Today we have access to a host of
advanced tools for analyzing such empirical patterns: the tools of multivariate data
analysis and the great analytical engine, the computer. We might hope to make
some sense of the past by noting relationships within and among these data
patterns. Using these tools I will describe in this chapter several paths toward
developing and testing models of the patterns of prehistoric land use in a region.


It should be noted at the outset that formulation of rigorous models through a 
priori deduction of underlying causal processes is a laudable goal. We must temper 
this goal, however, with a practical outcome. The social disciplines presently lack a 
broad theoretical base, and therefore deductively based modeling strategies typi- 
cally have little foundation. Haining (1981:88) has observed in geography, for 
example, that 


most geographers have had a preference for data analysis rather than rigorous model
formation through prior specification of the underlying process. In Britain this tendency
parallels the growing interest in problems of regional forecasting. The emergence of this
interest in the 1970s is in part the result of the discipline’s new quest for “relevance” at a
policy level. As a research goal it elevates the methods of data analysis over those of
rigorous model formulation through the need to provide answers to difficult and often
inherently messy problems. Only the simplest spatial processes are capable at the
present time of being given a rigorous formulation and there is a tendency for them to
seem trivial and unrealistic when set against the expansive problems of predicting
regional unemployment levels and forecasting the space-time evolution of epidemics.




The analogy with the archaeological problem of this volume is clear. Like
geographers, archaeologists have a “messy” and expansive problem: modeling regional
archaeological distributions. Like geographers, we can apply the methods of empirical
data analysis to this problem of regional forecasting because these models are
able to produce nontrivial results that can be used in applied, real-world contexts
(e.g., Custer et al. 1986; Kvamme 1986; Kvamme and Jochim 1988; Larralde and
Chandler 1981; Parker 1985; Scholtz 1981). This chapter focuses on these
data-analysis modeling approaches.

The unit of analysis in this chapter is the location, or land parcel. Treating the
land parcel as the unit of investigation allows greater freedom in the definition of the
dependent variable used in analysis (Carr 1985:116). At the very simplest level, a
binary dependent variable can be defined and coded according to whether an
archaeological site is present or is not present in a particular parcel (and it can be left
up to the researcher to define what constitutes an archaeological site). Some
investigators (e.g., Dunnell and Dancey 1983) argue against use of the site concept,
pointing out that the term site typically refers only to clusters of artifacts, a mere
subset of the archaeological record. By using the land parcel as a focus the researcher
can define virtually any archaeological manifestation of potential interest as the
dependent variable. Other examples of dependent variable categories include
parcels with 20 or more artifacts of any kind vs. parcels with fewer than 20 artifacts,
parcels with 10 or more sherds vs. parcels with fewer than 10 sherds, or parcels with any
cultural manifestation vs. parcels without prehistoric evidence. Note that more than
two categories also are appropriate, allowing investigation of multiple site or
functional land-parcel types simultaneously (e.g., settlement, temporary camp, kill
site, other archaeological evidence, no archaeological evidence). By using the land
parcel we are able to examine various environmental, social, or other characteristics
of the parcels that are coded as having archaeological manifestations or specific
types of manifestations, as opposed to parcels that contain little or no archaeological
evidence. An additional benefit of using the land parcel is that the size of the parcel
controls the scale of investigations: very small parcels allow investigation of
microenvironmental and other small-scale influences on archaeological distributions
and potentially allow greater detail and precision in modeling; large parcels
allow similar pursuits but on a grosser scale. (Note that if small parcels are used, and
large archaeological sites or scatters are present, then contiguous parcels may be
coded as “site” or “scatter” present.) In the following pages, discussion principally
focuses on the simplest two-category situation for the dependent variable, for ease
and clarity of presentation. All of the methods generally apply, of course, to
situations in which any number of categories are being used. Since archaeologists
traditionally have used the site concept, I use the term site in a general sense to refer
to land parcels possessing the archaeological manifestations of interest, however
defined. Similarly, the term nonsite is used to refer to land parcels that do not meet
the definition of the archaeological manifestations.
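
Coding the binary dependent variable from parcel records might look like the
following Python sketch. The parcel identifiers and artifact counts are invented,
and code_parcel is a hypothetical helper; the thresholds of 1 and 20 artifacts
simply mirror two of the example definitions given above.

```python
def code_parcel(artifact_count, threshold=1):
    """Return 1 ("site") if the parcel meets the chosen definition, else 0 ("nonsite")."""
    return 1 if artifact_count >= threshold else 0

# Hypothetical parcels: identifier -> number of artifacts observed.
parcels = {"P1": 0, "P2": 3, "P3": 25, "P4": 19}

# Definition 1: any cultural manifestation vs. none.
any_evidence = {pid: code_parcel(n) for pid, n in parcels.items()}

# Definition 2: 20 or more artifacts vs. fewer than 20.
twenty_plus = {pid: code_parcel(n, threshold=20) for pid, n in parcels.items()}
```

The same parcels receive different codes under the two definitions, which is the
freedom in defining the dependent variable that the land-parcel approach provides.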


The phrase predictive archaeological model, which has recently come into usage, is
somewhat misleading because most data analysis approaches do not really predict
where as yet undiscovered sites are specifically located. Instead, data analysis
approaches attempt to abstract the locational pattern exhibited by a sample of
site-present locations (or specific site-type locations) in a region in terms of
environmental, cultural, or other variables, and then to project this pattern over the
entire region (using various computer mapping techniques, if available; see Chapter
10). If the initial sample of site locations from which the model is abstracted exhibits
a locational pattern similar to that of the remainder of the region’s sites (i.e., if the
sample is a representative or random sample), and the sites are strongly patterned,
then the mapping of the model can provide a very good indication of where sites will
be found in the rest of the region. Thus, we do not predict the locations of
undiscovered sites; we merely map locations that possess environmental or other
characteristics that are similar to those of the initial site sample.


The nature of this mapping or extrapolation of an archaeological locational
pattern depends primarily on the quality and type of modeling approach used. The
mapping might correspond with simple environmental categories, such as plant
communities (Figure 8.1a), or it might plot a complex multivariate function of a
variety of factors with estimates of site sensitivity every 30 m across the region
(Figure 8.1b). These products of empirical data-analysis models (Figure 8.1) should
include performance indications: statistics that describe how well (e.g., how
accurately) the model and resulting map portray the locations of sites.


It should be emphasized that the ability to predict locations (land parcels)
where archaeological sites are likely to be located logically implies the ability to
predict where sites are not likely to be found. Without this ability the modeling
exercise becomes meaningless. It is easy to develop a model, for example, that
predicts the locations of all sites within a region with 100 percent accuracy; such a
model would simply classify every location (i.e., every land parcel) within the region
as likely to contain sites. Of course, nothing is gained from such a model. The
usefulness of a model must be judged not only by how well it predicts locations
likely to contain sites but also by how well it predicts locations unlikely to contain
sites. If a model is able to predict 90 percent of the site locations correctly in a region
representing only 50 percent of the total land area (as opposed to 90 percent of the
land area), then something is gained.


Many of the locational modeling approaches discussed in this chapter make use
of basic pattern-recognition principles and techniques (Duda and Hart 1973).
Predictive archaeological models developed within this perspective must work if
two assumptions can be met. The first assumption requires that the locational
patterns exhibited by the initial site (or site-type) sample used to “train” the
pattern classifier (the quantitative model) are reasonably representative of the site
population under study. The second assumption is that the site locations are
nonrandomly distributed with respect to the environmental or social factors under
investigation. Use of some form of random sampling designs (Mueller 1975) will
usually ensure that the requirements of the first assumption are met. With regard to
the second assumption, it is a basic premise of modern archaeology that human
behavior is patterned, and the investigator's familiarity with the region or with
settlement data in general will usually guarantee that some of the variables selected
will reflect this nonrandom behavior. When I indicate that such models “must
work,” I mean that there must be some gain (e.g., in terms of percent correct
predictions) over a purely random model with no predictive capacity.


Figure 8.1. End products of cultural resource modeling. (A) A simple plant community mapping in which the
communities correspond to different site densities (after Plog 1983:64). (B) A “site probability surface” superimposed on a map
and derived from a complex multivariate function of six variables measured in each 90 by 90 m cell (after Kvamme 1980).


We might define the gain concept more rigorously for purposes of this chapter.
It was stated above that the results of archaeological locational models should be
mappable within the region under study. When the model is mapped (e.g., Figure
8.1), certain areas of the region are indicated as being more likely to contain sites
than other areas. Only a percentage of all the sites (or of the site type under
investigation) in the entire region will occur within the areas indicated on the map.
If the area likely to contain sites is small (relative to the total area of the region) and
if the sites found in that area represent a large percentage of the total sites in the
region, then we have a fairly good model of site location. On the other hand, if the
area predicted to contain sites is a relatively large portion of the total area and the
percentage of sites within that area is not significantly greater than the percentage
of regional coverage, then the model is not very useful. Based on these considerations
we might explicitly define gain as





\[
\text{Gain} = 1 - \left( \frac{\text{percentage of total area covered by model}}{\text{percentage of total sites within model area}} \right)
\]


As gain approaches 1, the model has increased predictive utility; if it is near or
approximately 0, then the model has little or no predictive utility. If gain is negative
(<0), then the model has reverse predictive utility (i.e., a greater density of sites
occurs outside the area specified by the model). Such a model could still be of some
use if the area outside that specified by the model were subsequently considered to
be the area being modeled (but the model developer should be fired!).
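
The gain statistic is simple enough to compute directly. The Python sketch below
implements the definition given above as one minus the ratio of percent area to
percent sites; the function name gain is hypothetical, and the inputs reuse the
90 percent and 50 percent figures from the earlier discussion, plus one invented
case with negative gain.

```python
def gain(pct_area, pct_sites):
    """Gain = 1 - (percent of total area in model / percent of total sites in that area)."""
    if pct_sites <= 0:
        raise ValueError("pct_sites must be positive")
    return 1.0 - pct_area / pct_sites

useful = gain(50, 90)      # 90 percent of sites in 50 percent of the area
useless = gain(90, 90)     # 90 percent of sites in 90 percent of the area
trivial = gain(100, 100)   # every parcel classified as likely to contain sites
reverse = gain(60, 40)     # invented case: sites are denser outside the model area

print(round(useful, 3), useless, trivial, reverse)
```

The model that classifies every parcel as site-likely is 100 percent "accurate" for
sites yet has a gain of zero, which is why gain is the more informative basis for
comparing models.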


The gain statistic is used throughout this chapter as a means of comparing
models. Most archaeological modelers tend to focus on percent correct predictions
for sites, for nonsites, or even an overall percent correct statistic (see Chapter 3).
These statistics can be useful and important, but they can also lead to serious
misinterpretations. In addition, they offer little basis for comparisons between
models (these issues are discussed in detail below), while the gain statistic presented
here is easy to interpret and facilitates comparison.


An important consideration that must be addressed before model development
is discussed is exactly what types of sites or archaeological manifestations are
to be modeled. A central assumption in archaeology is that the locations of sites of
different functional categories or chronological periods will represent responses to
different situational contexts, such as environmental circumstances. It is important,
therefore, to develop models for specific archaeological types whenever possible.


In practice, specific site-type models are often difficult to establish for several
reasons. The problems lie not in the modeling techniques but in the definition of
meaningful and justifiable site types, in assigning sites to the types based on limited
and often questionable evidence, and in acquiring sufficiently large samples of the
types for subsequent analysis. The practice of assigning sites to functional types on
the basis of surface information or limited excavation data is often questionable. In
many regions, particularly where surface evidence consists of only a handful of
lithics, the investigator may be relying on the flimsiest of evidence (if any) and on
sheer guesswork. Although sites may be forced into type categories under certain
circumstances, the quality of the resultant groups and their utility for subsequent
analysis must be questioned. In other words, meaningless site types will yield
meaningless analysis results.


A second difficulty involves categorization of sites into many site type groups,
a procedure that can introduce sample-size problems. On the other hand, even
when only a few site-type categories are employed, certain types within a region,
such as major village centers or Paleoindian sites, might inherently exist only in
small numbers. Since locational models derived from empirical data require relatively
large samples in order to define a locational pattern successfully and extrapolate
it to a larger region, functional or temporal types containing few cases simply
cannot be modeled. In general, empirical models can be developed only for the few
types that contain a significant number of representative cases. Careful thought
should be given to the nature of the available evidence and the reliability of
resultant site types prior to subjecting the types to a modeling exercise.


In other publications (Kvamme 1983a, 1985a) I have suggested an alternative to
the practical problem of making traditional temporal-functional site types operational
using regional survey data. Site types can be defined on the basis of amount of
inferred activity occurring within a land parcel, rather than types of inferred activities.
The amount of activity is measured in terms of quantity and variety indices of
observed artifacts at a location. Locational studies can then be carried out by
comparing environmental characteristics among locations indicating much prehistoric
activity, locations indicating little prehistoric activity, and locations indicating
no prehistoric activity. This approach allows one to investigate why certain locations
were used in the past and why other locations were not.
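
One way to picture the amount-of-activity classes is the Python sketch below. It
is an assumption-laden illustration, not the procedure of the cited publications:
quantity is taken as the total artifact count, variety as the number of distinct
artifact classes, and the cutoffs of 10 artifacts and 3 classes for "much"
activity are arbitrary values chosen for the example.

```python
def activity_indices(artifact_classes):
    """Return (quantity, variety): total artifacts and number of distinct classes."""
    return len(artifact_classes), len(set(artifact_classes))

def activity_level(artifact_classes, min_quantity=10, min_variety=3):
    """Classify a parcel as 'none', 'little', or 'much' inferred activity."""
    quantity, variety = activity_indices(artifact_classes)
    if quantity == 0:
        return "none"
    if quantity >= min_quantity and variety >= min_variety:
        return "much"
    return "little"

# Hypothetical parcels, each listed as the class of every artifact observed.
empty_parcel = []
sparse_parcel = ["flake"] * 5
rich_parcel = ["flake"] * 8 + ["core"] * 2 + ["scraper"]

levels = [activity_level(p) for p in (empty_parcel, sparse_parcel, rich_parcel)]
```

Environmental measurements could then be compared across the three groups
without first assigning each location to a functional or temporal site type.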


Historical site location model development poses problems similar to those
encountered in prehistoric model development, but here additional problems arise.
In most regions the amount of time allotted for historical site model development is
probably best spent researching historical documents and archives, which often
indicate exactly where many types of historical sites are located (see Chapter 7).
Moreover, the best predictor of historical site locations in many regions may be
neither environmental phenomena nor the typically used cultural factors (such as
distance to nearest road), but simply the cadastral survey grid, since patterns of
settlement were often dictated by section and partial-section boundaries (e.g.,
Scholtz 1981:220). This is not to say that successful models for historical sites cannot
be developed utilizing the usual environmental or other predictors. Scholtz (1981),
for example, was able to construct a model for domestic historical site locations by
correlating 15 environmental variables with the locations of known sites in a
southern Arkansas region. Using a somewhat different approach, Monroe et al.
(1980) developed powerful trend surface models for the spread of historical settlement
in colonial Connecticut.
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At the extreme, and depending on the quality of the regional data, models can 
be developed for the locations of all sites as a single group within a regson. This 
approach has been critacized (and m certain contexts mghtly so) because lumping 
sites of many different functional types and temporal periods into a single group 
imtroduces a great deal of variability to any analysis, making « more difficult to 
develop a successful cultural resource model. As we shall see m later sections, 
however, this variation usually 1s substantially less than the vanation present m the 
environment as a whole, and it 1s possible on the basis of a general model to define 
significant portions of a regson that are unlikely to contain sites of any kind. If we 
lump together all environmental and other variation measured at all site locations, 
the resultant characteristics might define an acterity pace (see Kvarmme 1985a), a 
subset of the whole environment within which the bulk of human activity (: 
from moving from one activity place to another) 1s performed. Although different 
functional activities might be conducted m entirely ditierent situational contexts 
within the activity space, the activity space can be a useful construct for locational 
modeling purposes if it 1s substantially smaller than the whole environmental range 
of a region. 

It should be recognized that the goals of cultural resource management may
not always be consistent with traditional archaeological perspectives. For example,
cultural resource managers are often interested in regional models for the locations
of all sites in general, simply because all sites are initially important from a
management standpoint. Additionally, models for traditional site types might not
be as important as models for significant sites, where significant sites are defined
as those important to predefined regional research questions.


In the following pages, site location models are often referred to in a general
sense. Such statements should not be taken to apply only to models for all sites as a
single group, but also to models for specific types of sites, since the methods
discussed are applicable to any class or classes of sites.


Finally, since this chapter covers such a wide diversity of topics, three data sets
are used to provide the best possible illustrations of the methods employed. The
data sets are (a) a western Colorado data set from a mesa and canyon region known
as Glade Park, used to illustrate model-building and model-testing procedures; (b)
an eastern Colorado plains data set, used to compare different types of modeling
approaches and their mappings; and (c) a Mesolithic data set from the Federal
Republic of Germany, used to illustrate modeling multiple archaeological site
classes.


VARIABLES USED IN LOCATIONAL RESEARCH 


A researcher usually selects a variable for investigation in locational analyses
because distributions of archaeological phenomena are believed to have been
somehow influenced by that variable. Hence, most researchers rely on the results of
previous and similar studies in order to determine the variables to be used in an
investigation. A multitude of perspectives have been applied in archaeology to
examine site locational information. Those that focus on the physical environment
and its effect on settlement behavior occupy a major portion of the locational
analysis literature. The examination of site catchments, topography, vegetation,
and other environmental features is a major element of this approach. Roper
(1979a) has labeled analyses in this perspective the study of man-land relationships,
as opposed to man-man relationships. The latter term refers to analyses that assess
the importance of the human or social environment in structuring patterns of
settlement. These analyses focus on such areas as central place theory, the rank-size
rule, and population distributions over the landscape. Although man-man
relationships play a major role in the settlement pattern of modern industrialized
society (Haggett et al. 1977) and offer an important and useful perspective in many
archaeological applications (Flannery 1972; Johnson 1977), many key features of this
approach are meaningless in a large number of archaeological situations. For
example, in most hunter-gatherer contexts markets and central places are not
meaningful concepts. Moreover, the primary orientation of man-man approaches is
the analysis of properties related to fixed settlements in space, again precluding
investigation of much of prehistory (e.g., many hunter-gatherer groups). In
contrast, man-land relationships are intimately related to site location decisions
among hunter-gatherer groups (Bettinger 1980; Jochim 1976; Wood 1978), and they play
a significant role in the settlement patterns of more complex societies (Green 1973;
Grossman 1977; Hill 1971; Hudson 1969). An investigation of man-land relationships
can contribute to our understanding of locational behavior regardless of cultural
form, and this is why most work in site locational modeling has focused on
environmental data. Another reason for this focus is that environmental data are
generally easier to acquire than social data. Although social factors undoubtedly
influence settlement decisions in most cultural contexts, given the nature of the
archaeological record it is generally impossible in any but the best understood and
preserved archaeological regions to reconstruct contemporaneity between sites,
population structures, and so on, all of which are important requisites for
investigating social phenomena. For this reason, social factors often cannot be
examined as frequently as environmental factors in archaeological locational
studies.


Archaeologists have traditionally relied to an extraordinary degree on the use
of nominal-level variables to describe phenomena under investigation in regional
research. Examples include a focus on biotic communities, soil classes, or the
practice of classifying a region as “level” or “steep.” Landforms are often
categorized into discrete types, such as riverine, arable, mesa top, mesa side, mesa
bottom, and southern aspect (e.g., Euler and Gumerman 1978; Gumerman 1971; Plog 1971;
Plog and Hill 1971; Zarky 1976); indeed, interval-level data are sometimes rescaled
to the nominal level. Yet most archaeological phenomena are eminently quantifiable.
Geographically distributed phenomena, particularly characteristics of the natural
environment, by their very nature are distributed in a continuous manner (and
thus are potentially quantifiable). Slope, aspect, and distance to nearest drainage,
for example, change continuously as one moves over the landscape. So do vegetation
diversity, density, and biomass, as well as soil pH and mean grain size. The use of
categorical data and the practice of rescaling interval-level measurements to the
nominal level cause critical information to be discarded, reduce the power of
subsequent analyses (since nominal-level data contain less information than
corresponding interval-level data), and preclude use of many powerful analytical
alternatives and research designs.


A major focus of this chapter will be on the use of continuously measured data
in site location research, and emphasis is placed on the importance of developing
suitable measurement concepts. The types of phenomena typically investigated in
this research might conveniently be grouped according to two major classes (see
Plog 1971:47-48): environmental factors and social factors. The following discussion
of a number of key variables that have been frequently examined in site location
studies is by no means an exhaustive summary. In any particular region, some of the
variables mentioned may not be appropriate.


Environmental Factors 


Landform and landform-related phenomena are commonly considered in
archaeological studies. A typical approach is to categorize the landscape into a
series of nominal-level types, such as canyon, canyon floor, canyon side, cliff,
mesa, plain, and slope (e.g., Vivian et al. 1980) and to observe the distribution of
archaeological sites across these categories. Such categorization of continuous
landscape forms, in addition to the problems outlined above, leads to problems of
definition and tends to imply a definiteness about these categories that may not be
warranted (Robinove 1981:240). For example, how does one consistently delineate
boundaries around a construct such as an arroyo head? Additionally, class boundaries
may be totally arbitrary; a line dividing level from steep locations depends on
current definitions of what is level and what is steep.


Steepness of ground is widely investigated in settlement studies because
settlements typically are located on level surfaces where steep slopes do not
interfere with activities (Judge 1973:133; Roper 1979b:77-81; Williams et al.
1973:230). This concept is easily made operational as a quantitative variable in a
variety of ways, such as slope as percent grade (Figure 8.2a; note that the U.S.
Geological Survey provides a template that performs this calculation). The form or
roughness of local terrain has also been investigated (Hurlbett 1977:25-26; Plog
1981:49), presumably because rough local terrain would inhibit day-to-day activities
and travel to and from sites (Ericson and Goldstein 1980). One measure of local
terrain roughness is termed local relief (Hammond 1964); it is measured as the range
in elevation within a predefined radius of a location under investigation (Figure
8.2b). High values suggest rugged terrain while low values suggest gentle terrain. A
terrain texture measure, borrowed from image processing (Mork 1980:233), provides
another alternative. An elevation is estimated at the locus of interest and at a
fixed pattern of points surrounding the locus (Figure 8.2c). The variance of these
elevations is then computed. High values suggest variable and dissected terrain,
while low values indicate a level, smooth surface.

Water resources are widely viewed as important factors in locational studies.
Roper (1979a) states, “some resources, such as water, are so basic and so vital that
the distance to obtain them must be minimized.” In a cross-cultural study of criteria
influencing hunter-gatherer site-placement decisions, Jochim (1976:55) designates
proximity to water sources as a central factor in determining immediate site
placement. Most often examined in settlement studies are distances to a variety of
water source types, such as permanent rivers, seasonal streams, lakes, springs, or
streams of specified rank (e.g., Brown 1979; Judge 1973:120; Levis 1976; Parker 1985;
Roper 1979b:81; Scholtz 1981). Linear distances are easy to measure; least-effort
travel distances are somewhat harder to estimate (Ericson and Goldstein 1980).
Archaeologists using categorical variables generally assign class boundaries to
drainage basins and let the highest stream rank in each basin represent the class
category (e.g., Plog and Hill 1971:23; see Unwin 1981:79-84 for a discussion of
systems of stream ranking).
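
The three terrain measures described above (slope as percent grade, local relief, and terrain texture) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample elevations are hypothetical, and the population variance is used for the texture measure because the text does not say which variance estimator was intended.

```python
from statistics import pvariance


def percent_slope(rise_m, run_m):
    """Slope as percent grade: vertical rise over horizontal run, times 100."""
    return 100.0 * rise_m / run_m


def local_relief(elevations_m):
    """Range in elevation among points sampled within a predefined radius."""
    return max(elevations_m) - min(elevations_m)


def terrain_texture(elevations_m):
    """Variance of elevations at the locus and a fixed pattern of nearby points.

    Population variance is an assumption of this sketch.
    """
    return pvariance(elevations_m)


# Hypothetical elevations in meters: the locus first, then surrounding points.
mesa_top = [1500, 1501, 1499, 1500, 1502, 1500, 1499, 1501, 1500]
canyon_rim = [1500, 1460, 1455, 1490, 1510, 1515, 1505, 1470, 1450]

for name, elevs in (("mesa top", mesa_top), ("canyon rim", canyon_rim)):
    print(name, local_relief(elevs), round(terrain_texture(elevs), 1))
```

In this made-up example both measures rank the canyon rim as rougher than the mesa top, which is the behavior the text describes: high values for rugged or dissected terrain, low values for level ground.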


The importance of view to hunter-gatherers for surveillance of the surround- 
ing terrain is a widespread notion, and the necessity of a good field of view for 
spotting game animals is often cited. Jochim (1976:51, 55) suggests that a good view 
is one of the chief noneconomic objectives in the selection of immediate site 
locations among hunter-gatherers. In a more complex social context, among the 
pastoral Maasai (Western and Dunne 1979) view is mentioned as an important 
settlement location criterion purely for aesthetic reasons. A good view might also be 
of importance for social or defensive reasons. 





A measure of view quality was introduced by Brown (1979:197) in a study of
settlement patterns in western Kansas. This measure, which yields an angle “of
surrounding terrain visible from a site” (Figure 8.3a), has been used in a number of
archaeological studies (e.g., Kvamme 1983b, 1983c, 1984; Larralde and Chandler
1981; Reed and Chandler 1984). A more common measure pertaining to the view
concept is a linear distance to an overview or vantage point (e.g., Brown 1979:197;
Judge 1973:133; Larralde and Chandler 1981:118), where vantages are defined as high
points, such as hilltops, ridge crests, or mesa and canyon rims. If view was important
to the prehistoric occupants, then sites might be located on or in proximity to
vantages. The importance of view, of course, might vary with cultural type, site
function, the kind of animal being hunted, and from region to region and season to
season.


Shelter and the quality of the shelter provided by a location are often recognized
as being important in site location studies. Locations offering protection from wind,
adverse weather, or even sunshine (in desert regions) might have been sought after
for site placement. Euler and Chandler (1978), for example, examined the shelter
quality of settlements in the Grand Canyon in Arizona. Among hunting-gathering
groups, Jochim (1976:51) designates shelter as a central factor in the choice of
location.


Figure 8.3. Measurement of variables. (A) View angle, a measure of view quality. The
hill (A) has the widest horizontal view, the ridge flank (B) has a narrower view, and
the drainage (C) has the narrowest view. (B) Cylinder volumes, inversely proportional
to the sheltering effects offered by a location. The hilltop (top) offers poor shelter
and has a large volume; the level plain (middle) offers intermediate shelter and has
an average volume; and the valley bottom is more sheltered and has a small volume.


Shelter is a difficult concept to make operational; Euler and Chandler (1978)
examined situational categories of shelter in the Grand Canyon, and Larralde and
Chandler (1981) used an ordinal rank of 11 sheltering categories (from low or no
shelter to extremely high shelter) for site location investigations in Utah. In a
recent paper (Kvamme 1984) I have attempted to devise an interval-level measure of
shelter by considering how exposed a location is in terms of the shape of surrounding
terrain. The measure is derived by imposing an imaginary cylinder over the location
of interest. The top of this cylinder is a constant height (x) above the locus, and
its sides are a constant distance (y) from the locus. The volume of air above the
ground surface encompassed by this cylinder constitutes the measure of shelter. A
large volume (e.g., surrounding a hilltop location) suggests an exposed location with
a low level of shelter, and a small volume (e.g., surrounding a valley bottom
location) suggests a relatively sheltered location (Figure 8.3b). The ground surface
is roughly approximated by nine elevations measured at a locus of interest (0) and at
surrounding loci every 45° at a fixed radius (y). The area of the base can be
approximated (base = πy²) or calculated exactly (base = [√8]y²). The volume within
the imaginary cylinder above the ground surface is calculated (after simplification)
as follows:


volume = (base/12)(12x + 8[E0] - E1 - E2 - E3 - E4 - E5 - E6 - E7 - E8)


where E0, E1, etc., are the nine elevations. This index might be referred to more
appropriately as an index that reflects hill-like vs valleylike characteristics (see
Kvamme and Jochim 1988).
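
A minimal sketch of the cylinder-volume index follows. It assumes that the exact base is the area of the regular octagon through the eight surrounding points, √8·y², and that the circular base πy² is the approximation; the function name, argument names, and sample elevations are hypothetical.

```python
import math


def shelter_volume(e0, ring, x, y, exact_base=True):
    """Air volume inside an imaginary cylinder placed over a location.

    e0   -- elevation at the locus of interest
    ring -- eight elevations at radius y, taken every 45 degrees
    x    -- height of the cylinder top above the locus
    y    -- horizontal distance from the locus to the cylinder sides
    """
    if len(ring) != 8:
        raise ValueError("expected eight surrounding elevations")
    # Assumed bases: octagon area (exact) or circle area (approximation).
    base = math.sqrt(8) * y ** 2 if exact_base else math.pi * y ** 2
    return (base / 12.0) * (12.0 * x + 8.0 * e0 - sum(ring))


# Hypothetical loci, all at 100 m, with x = 10 m and y = 50 m.
hilltop = shelter_volume(100, [90] * 8, x=10, y=50)   # neighbors lower
plain = shelter_volume(100, [100] * 8, x=10, y=50)    # neighbors level
valley = shelter_volume(100, [110] * 8, x=10, y=50)   # neighbors higher
print(hilltop > plain > valley)
```

On level ground the expression reduces to base × x, the full volume of the cylinder, which is a quick check on the formula. Lower surrounding terrain adds air volume (more exposure) and higher surrounding terrain removes it (more shelter).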





The exposure or aspect of a site is often examined in site location studies in 
connection with sheltering effects. A south-facing aspect, for example, tends to 
offer greater warmth from the sun (during much of the year in most of the northern 
hemisphere). Grady (1980:170) argues that sites may be located with primary 
exposures away from prevailing wind or storm approaches. 


Aspect is usually measured by drawing a line perpendicular to the elevation
contours of sloping terrain and recording the azimuth of this line, which provides a
measurement that ranges from 1 to 360° (Figure 8.2a). A difficulty that this scale
poses is that 1° and 359° both indicate approximate north, yet in a quantitative
analysis 359 is much greater than 1. This difficulty can be resolved by collapsing the
west half of the compass scale over the east half, such that every azimuth on the
west half is given the azimuth of its mirror image on the east half. This
transformation allows the measurement of direction relative to north or south where
0° is north, 180° is south, and 180° is twice as far south as 90° (east or west).
Another approach is simply to use the cosine of the angle of prominent direction
(Hartung and Lloyd 1969).
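
Both aspect transformations are simple to express in code. The sketch below is illustrative only; the function names are hypothetical, and azimuths are assumed to be given in degrees clockwise from north.

```python
import math


def fold_aspect(azimuth_deg):
    """Collapse the west half of the compass onto the east half.

    Returns degrees away from north: 0 is north, 90 is east or west,
    and 180 is south.
    """
    a = azimuth_deg % 360.0
    return a if a <= 180.0 else 360.0 - a


def aspect_cosine(azimuth_deg):
    """Cosine of the azimuth: +1 for north, 0 for east or west, -1 for south."""
    return math.cos(math.radians(azimuth_deg))


for az in (1, 90, 180, 270, 359):
    print(az, fold_aspect(az), round(aspect_cosine(az), 3))
```

After folding, azimuths of 1° and 359° both map to 1°, so the two near-north aspects are no longer treated as far apart. The cosine transformation has the same north-south symmetry but compresses differences near due north and due south.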


Resources (other than water) and their importance to site placement are often
examined in site location studies. The resources usually investigated are biotic
communities. A major approach is to divide a study region into environmental
categories, such as plant communities, and to examine the number or density of sites
in each community (e.g., Bettinger 1977; Thompson 1978). Catchment analyses
utilize a variety of different perspectives. The percentage of various resource
communities found within a fixed distance of a site might be examined (Findlow and
Ericson 1980), or perhaps the variability of resources or indices of caloric potential
of the area within that catchment might be calculated. Simple distance measures to
various resources are often utilized. Lipe and Matson (1971:134) mention that sites
might “be located so as to maximize access to several resource zones”; Gumerman
and Johnson (1971) investigate the biological transition zones between major
communities, or ecotones, arguing that these zones “are also cultural transition
zones.” Simple distance measures to these resource zones might be utilized, such as a
distance to the nearest ecotone or to a specific plant community (e.g., Bradley et al.
1984:75). Carr (1985:123) discusses other distance measures. When using biotic
variables, the researcher should keep in mind that present-day vegetation may not
necessarily correspond to past situations owing to changes in climate or land-use
practices.

Finally, it should be recognized that other resources, such as fuel (Jochim 
1976:51), might be important considerations in site location research. In the same 
vein, such resources as lithic raw materials (Johnson 1977:484) might exert a “pull” 
on settlement location, and a corresponding variable, such as distance from a lithic 
quarry, might be used in archaeological locational studies.


Social Factors 


The variety of social variables utilized in archaeological locational studies is
certainly smaller than the range of environmental factors that have been
investigated. General concepts that have been examined relate to local site
densities, site proximities, and spacing. Plog (1971:47-48) mentions the importance
of density (the distance to other sites or sites of specific type) as well as
distance to great kivas and other ceremonial sites in a southwestern archaeology
context. The Southwestern Anthropological Research Group (SARG) computer system
incorporates such social locational variables as number of sites within 1 km and
number of habitation sites within 1 km of the site being recorded (Plog 1981:54).
Horizontal distance to first- through fifth-nearest contemporary habitation sites
was investigated by Adams (1974) in a locational analysis of Pueblo sites in southern
Colorado.
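
Social variables of this kind (counts of neighboring sites within a radius, and distances to the first- through nth-nearest sites) reduce to simple distance calculations. The following sketch assumes planar coordinates in meters; the function names and the site coordinates are hypothetical and do not reproduce the SARG system or Adams's procedure.

```python
import math


def distances_to_others(site, others):
    """Sorted straight-line distances from one site to every other site."""
    return sorted(math.dist(site, other) for other in others)


def count_within(site, others, radius_m=1000.0):
    """Number of other sites lying within radius_m of the site."""
    return sum(1 for d in distances_to_others(site, others) if d <= radius_m)


def nth_nearest(site, others, n):
    """Distance to the nth-nearest other site (n = 1 is the nearest)."""
    return distances_to_others(site, others)[n - 1]


# Hypothetical site coordinates in meters.
focal = (0.0, 0.0)
neighbors = [(300.0, 400.0), (600.0, 700.0), (3000.0, 4000.0),
             (0.0, 1500.0), (1200.0, 0.0)]

print(count_within(focal, neighbors))            # sites within 1 km
print([round(nth_nearest(focal, neighbors, n)) for n in range(1, 6)])
```

Restricting `others` to contemporary habitation sites before calling these functions would give the habitation-only variants mentioned in the text, provided contemporaneity can be established.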


Gravity models are often used in environmental analyses because settlement
locations “appear to be related to movement-minimizing behavior” (Johnson
1977:489), which helps to justify arguments about locational proximity to critical
resources (e.g., Jochim 1976). The same perspective can be applied to cultural
features. Thus, distance to the nearest road or road intersection might be a useful
variable if prehistoric road networks were culturally important, if they can be traced
across a region, and if contemporaneity of sites and roads can be established. An
implicit basis of central-place theory is that central places can be viewed as resource
centers. Hodder and Orton (1976:108) illustrate empirical data that show decreasing
site frequency with distance from a resource center.


Spacing between settlements is also a concern. Hill (1971:56) mentions “spacing
due to competition with other groups for critical resources,” which might fit in
with certain territoriality concepts (Bettinger 1980:225; Wilmsen 1973). A major
concept in many settlement studies is regular spacing characterized by hexagonal
arrangements of settlement around major centers or central places (e.g., Johnson
1972; Flannery 1972). Wobst (1976) discusses hunter-gatherer spacing requirements
from the standpoint of demographic constraints on biological reproduction.


ASSESSING PATTERNS IN ARCHAEOLOGICAL 
LOCATIONAL DATA 


Approaches to the study of archaeological site location are, of course, myriad 
(see Kohler and Parker 1986 for an extensive overview). Quantitative data analysis 
approaches might initially be lumped into two categories: those based on trends in 
location and those based on trends in characteristics of locations. Models of the 
locational trends of site distributions are based solely on spatial coordinates; 
locations in space are modeled, not characteristics of locations. As Parker (1985:202)
notes,


Even in the case of accurate representation of a distribution . . ., this methodology
gives no information for explaining why the distribution is in a particular form.
Explication of site settlement systems is enhanced by methodologies which relate site
presence to location characteristics, thereby allowing interpretations as to why sites
are located where they are.


Models of trends in locational characteristics, on the other hand, analyze 
empirical relationships among characteristics of the natural or social environment 
and the locations of sites. Modeling of locational characteristics has been the
dominant approach, and such models are to be preferred not only because of their 
generally greater power (see below), but also because they offer some potential for 
interpretation. 


Approaches Based on Trend in Location Only 


Approaches that focus on trend in location attempt to model regional site 
distributions only on the basis of locational (x,y) coordinates. No other information 
is used. Positions in space are modeled, not characteristics of the spatial positions. 
Hence, these models are generally rather crude. 


Trend-surface analysis (Unwin 1975), a regression technique, is one procedure
for modeling locational trends, although it is not ideal for site location data. Based
on spatial coordinates of known sites, most archaeological applications develop
functions to model a continuous dependent variable, such as trends in dated sites,
across a region (Bove 1981; Monroe et al. 1980; Roper 1976). Other examples include
modeling trends in length/width indices of Bagterp spearheads across northern
Europe or varying percentages of Oxford pottery across southern Britain (Hodder
and Orton 1976:164-174).

Note that all of these studies utilize a continuous dependent variable, which
poses something of a problem for site location analysts because often their goal is to
develop models for discrete classes of such information as site (or site-type)
presence or absence. This amounts to a nominal-level dependent variable, for which
most regression techniques are poorly suited. One analytical alternative for site
location modeling in the traditional regression context is to convert the
presence/absence criterion to some numeric form that the technique is better able to
handle. This might be accomplished by placing an arbitrary grid over the region and
estimating site density or performing a simple site count in each grid cell to provide
a dependent variable that is more than dichotomous. This approach has been used
in a number of archaeological studies to develop regression models of artifact counts
per grid unit for intrasite distributional analyses (Feder 1979; Hietala and Larson
1979; Larson 1975). For site location studies, a similar approach could be applied on a
larger scale by gridding a region and treating sites as the unit of analysis.
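
The gridding step described above can be sketched as follows. The coordinates and cell sizes are hypothetical; the point of the example is only to show how site presence is converted to a count per cell and how the resulting counts depend on the grid that is imposed.

```python
from collections import Counter


def site_counts_per_cell(sites, cell_size_m):
    """Count sites per square grid cell; keys are (column, row) indices."""
    counts = Counter()
    for x, y in sites:
        cell = (int(x // cell_size_m), int(y // cell_size_m))
        counts[cell] += 1
    return counts


# Hypothetical site coordinates in meters.
sites = [(120, 80), (480, 450), (520, 460), (990, 990)]

fine = site_counts_per_cell(sites, 500)     # three occupied cells
coarse = site_counts_per_cell(sites, 1000)  # a single cell holding all sites
print(dict(fine))
print(dict(coarse))
```

Two of these sites lie only 40 m apart yet fall in different 500 m cells, while a 1000 m grid places all four sites in one cell. Any regression fitted to such counts inherits this sensitivity to the arbitrary choice of cell size.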


A major problem with the trend-surface regression approach is that different
results can be obtained depending on which arbitrary grid size is chosen. A second
problem involves the deficiencies of the regression model when it is applied to a
dependent variable consisting of counts. Hodder and Orton (1976) and Davis (1973)
discuss general problems in the use of trend-surface analysis.


Kriging approaches to the same problem (Parker 1985:202-205; Zubrow and 
Harbaugh 1978; Chapter 2, this volume) utilize similar kinds of data, spatial 
coordinates and site counts per gridded unit area, and generally do a better job of 
modeling densities across a region than trend-surface approaches (Delfiner and
Delhomme 1975). This method also suffers from problems resulting from arbitrary 
grid sizes, however. 


Recently, an approach to trend mapping that is specifically designed for
nominal-level class categories has been developed (Wrigley 1977a, 1977b). This
method is based on a logistic regression technique (see below) and can be referred to
as logistic trend-surface analysis. It makes no assumptions about distributional form,
and it is appropriate for a nominal-level dependent variable. Moreover, the
dependent variable can consist of multiple class categories (e.g., site absent, site
type A present, site type B present, site type C present). For a given locality, with
spatial coordinates x and y, the outcome is a value for a class that is constrained
between 0 and 1. This value can appropriately be interpreted as the probability of
an outcome, such as site presence, given its location coordinates (Wrigley 1977b:12).
Examples of this technique all come from geography and include the probability of
households in a neighborhood shopping at a particular market vs the probability of
the households not shopping at that market (a two-class problem; Wrigley 1977b)
and probability trend surfaces of households highly annoyed, moderately annoyed,
and little annoyed by aircraft noise in the vicinity of Manchester Airport (a
three-class problem; Wrigley 1977a).


A model of archaeological site trend in location can be developed through
application of the logistic trend-surface technique. The locations of 95 known
open-air lithic scatters in a 5.5 by 8.5 km test study region on the southern Colorado
plains are presented in Figure 8.4a (this study region will be extensively used for
examples in later sections of this chapter). The study region has been gridded into
approximately 19,000 cells (land parcels) measuring 50 m on a side; Figure 8.4a
illustrates those cells with open-air sites present. The results of various efforts to
develop a probability trend surface for the presence of this site type based only on
the spatial coordinates of the known sites are shown in Figure 8.4b-d. This is a
simple two-class problem of site presence and site absence, although we are
interested only in the mapping for the site-present class. (Note that in a two-class
problem the mapping of one class is the “negative image” of the other class since
probabilities at any locus must sum to unity. Thus, it is not necessary to produce
probability surface maps for both classes. In a problem context involving three or
more classes, however, a separate probability surface map for each class is required,
each derived from a separate equation that is mathematically calibrated to the other
class equations.) The site-absent locations were obtained at 54 locations (cells)
systematically placed every kilometer across the study area.
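
The figures quoted for the study region can be checked with a short calculation. The sketch below assumes a rectangular 5.5 by 8.5 km region, 50 m cells, and site-absent sample points starting at one corner and repeating every kilometer along both axes; the starting corner and variable names are assumptions made for illustration, not details given in the text.

```python
# Region dimensions and cell size in meters.
WIDTH_M, HEIGHT_M, CELL_M = 5500, 8500, 50

# Number of 50 m cells along each axis and in total.
n_cols = WIDTH_M // CELL_M
n_rows = HEIGHT_M // CELL_M
n_cells = n_cols * n_rows

# Site-absent sample points placed every 1000 m, starting at the origin
# corner (an assumed starting position).
absent_points = [(x, y)
                 for x in range(0, WIDTH_M, 1000)
                 for y in range(0, HEIGHT_M, 1000)]

print(n_cols, n_rows, n_cells)   # 110 170 18700
print(len(absent_points))        # 54
```

Under these assumptions the grid contains 18,700 cells, which agrees with the "approximately 19,000" reported, and a one-kilometer lattice yields 6 × 9 = 54 points, matching the number of site-absent locations.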


First- through fourth-order logistic trend surfaces were fitted to these data
using the BMDP logistic regression program (Dixon et al. 1983). Fitting trend
surfaces to empirical data requires use of polynomial functions, which employ
various powers of a variable. A function of x and x² (a second-order model) makes a
graph with one “bend”; a function of x, x², and x³ (a third-order model) makes a
graph with two bends, and so on. Since we are working in a two-dimensional space
with (x,y) coordinates, we need to express powers of both variables (x, x², x³, . . .;
y, y², y³, . . .) plus all interactions between the two variables (xy, x²y, xy²,
x²y², . . .). Generally, the higher the order of the model the better the fit to the
data. Because the resultant functions are only combinations of these rather
meaningless variables and their powers, it becomes clear what Parker (1985:202) was
alluding to in the quotation given above, when she claimed that these models have
little explanatory potential.


The first-order probability surface contains the terms x and y. The second-order
model adds the terms x², xy, and y²; the third-order model adds to these the
terms x³, x²y, xy², and y³; the fourth-order model adds the terms x⁴, x³y, x²y², xy³,
and y⁴ (see Feder 1979:96). Thus, the fourth-order model contains a total of 15
parameters (including an intercept) that must be estimated. Second- through
fourth-order surfaces are portrayed in Figure 8.4, with site-presence probabilities
portrayed in steps of 0.2 probability and in levels of increasing darkness.
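
The structure of these polynomial surfaces can be made concrete with a short sketch. The code below enumerates the terms of a trend surface of a given order and evaluates a two-class logistic probability from a set of coefficients. It does not fit a model (the chapter used BMDP for that), and the function names and the zero coefficients in the example are purely illustrative.

```python
import math


def trend_terms(order):
    """Exponent pairs (i, j), meaning x**i * y**j, with 1 <= i + j <= order."""
    return [(degree - j, j)
            for degree in range(1, order + 1)
            for j in range(degree + 1)]


def site_probability(x, y, intercept, coefs, order):
    """Two-class logistic probability of site presence at location (x, y).

    coefs must list one coefficient per term, in trend_terms(order) sequence.
    """
    terms = trend_terms(order)
    if len(coefs) != len(terms):
        raise ValueError("one coefficient per polynomial term is required")
    z = intercept + sum(b * x ** i * y ** j for b, (i, j) in zip(coefs, terms))
    return 1.0 / (1.0 + math.exp(-z))


for order in range(1, 5):
    # Parameter count = polynomial terms plus the intercept.
    print(order, len(trend_terms(order)) + 1)

# With every coefficient at zero the surface is flat at probability 0.5.
print(site_probability(2.0, 3.0, 0.0, [0.0] * 14, order=4))
```

The parameter counts come out as 3, 6, 10, and 15 for the first- through fourth-order surfaces, in agreement with the 15 parameters cited for the fourth-order model. In practice the coordinates would usually be rescaled before raising them to high powers, to keep the terms numerically manageable.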


In traditional trend-surface analysis (discussed above) the utility of the various
polynomial surfaces is usually evaluated on the basis of increases in R² (variation
accounted for in the dependent variable) over previous surfaces (Unwin 1975). This
is not possible with the logistic trend-surface technique since the dependent
variable is categorical. A number of pseudo-R² statistics for logistic regression have
been introduced. One, Ry² (Baxter and Cragg 1970), provides a value that ranges
between 0 and 1, although midrange values are considered very good for indices of
this kind (Stopher and Meyburg 1979:334). The first- through fourth-order surfaces
shown in Figure 8.4 yield the following values of Ry²: 0.0218, 0.3125, 0.3799, and
0.5043, respectively. Thus, the first-order surface accounts for almost none of the
“variation” in site presence/absence. The second-order surface provides a substantial
improvement (increasing Ry² by about 0.29) because the resulting probability
ellipses center around the major concentration of site locations (Figure 8.4b). The
third-order surface provides an increase in Ry² of about 0.07, and the fourth-order
surface yields another leap, an increase of 0.12. Note that the fourth-order
surface (Figure 8.4d) does a relatively good job of modeling or describing the spatial
distribution of the known sites (several branches and clusters of sites are picked up
by the surface), considering that it represents a simple function based solely on the
spatial coordinates of the site-present and site-absent data. It should be apparent
that if the locations of the known sites in a modeled region are representative of the
locations of unknown sites in as yet unsurveyed areas of that region, then high-order
logistic trend surfaces offer a predictive aspect, like any other model.


Figure 8.4. Models based only on locational or positional information ([x,y]
coordinates): (A) locations of known sites, (B) second-order logistic trend surface,
(C) third-order logistic trend surface, (D) fourth-order logistic trend surface.


We might also apply the gain statistic, discussed above, in order to examine
model performance in a more interpretable way. The gain statistic was defined as
one minus the ratio of the percentage of the total area encompassed by a model
when mapped to the percentage of total sites within the model's area; a good
model is suggested as values approach 1 (small area with a high percentage of sites).
The locations in Figure 8.4 with an estimated probability of membership in the
site-present class greater than 0.5 (the two and one-half darkest levels of shading)
can be used as the area encompassed by each model. The 0.5 point, which is a
traditional decision rule, is arbitrarily used here and elsewhere for comparative
purposes only; later sections examine other decision rules. When the 0.5 level is
used, the second-order model (Figure 8.4b) covers approximately 40 percent of the
study region and 74 of the 95 sites (78 percent) occur within that area. This yields a
gain statistic value of 1 - 40/78, or 1 - 0.513 = 0.487. A similar assessment of the
third-order model (Figure 8.4c) reveals that the modeled area is 39 percent of the
total area and that 80 percent of the sites (76 of 95) lie within that area. Thus, the
third-order model provides only a slight improvement in gain (gain = 0.513). The
fourth-order model (Figure 8.4d) provides a major improvement, encompassing
only 31 percent of the total area and including 82 percent of the sites (78 of 95),
yielding a gain statistic of 0.622.
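

The gain arithmetic above can be retraced with a short sketch. The function name and
data layout are illustrative assumptions, not part of the original analysis; the area
and site counts are those reported in the text, except that the 39 percent area for the
third-order model is the value implied by its reported gain of 0.513.

```python
def gain(pct_area: float, pct_sites: float) -> float:
    """Gain = 1 - (percent of area in model) / (percent of sites in model)."""
    if pct_sites <= 0:
        raise ValueError("pct_sites must be positive")
    return 1.0 - pct_area / pct_sites


TOTAL_SITES = 95

# (percent of study region above the 0.5 cutoff, number of sites inside it)
models = {
    "second-order": (40.0, 74),
    "third-order": (39.0, 76),   # area implied by the reported gain (assumption)
    "fourth-order": (31.0, 78),
}

gains = {}
for name, (pct_area, n_sites) in models.items():
    # The text reports site percentages rounded to whole numbers.
    pct_sites = round(100.0 * n_sites / TOTAL_SITES)
    gains[name] = gain(pct_area, pct_sites)
    print(f"{name}: {pct_sites} percent of sites, gain = {gains[name]:.3f}")
```

A model covering the whole region and capturing every site gives `gain(100, 100) == 0`,
which is the degenerate case the statistic is designed to expose.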


Approaches Based on Trends in Locational Characteristics 


Archaeologists have examined trends in archaeological site locational
characteristics, particularly environmental features, for a long time. To illustrate, in a
study of the Paleoindian occupation of central New Mexico Judge (1973) examined
water sources, vantage points (from which game might be viewed), hunting areas,
and trapping areas (locations where large animals could be driven and trapped), and
their relationships with the sites in his sample. In their investigation of prehistoric
Shoshonean settlement patterns in Nevada, Thomas and Bettinger (1976) examined
distance to water, distance to the piñon ecotone (piñon was considered an important
economic resource), elevation above the valley floor, and ground slope. The
importance of shelter, fuel (firewood), a good view (to observe game), and water to
the immediate locations of hunter-gatherer sites in general was outlined by Jochim
(1976:55) in a study based on ethnographic literature. Analyses pertaining to more
complex agricultural situations often examine conditions related to the arability of
the land. For this reason, Green (1973) investigated five variables related to soil type
in an analysis of Maya settlement in Belize. A soil texture variable as well as
vegetation, hydrographic, and landform variables were examined by Roper (1979)
in a study of Woodland site locations in central Illinois. In all of these studies,
characteristics of site locations are the focus of interest, and as noted earlier, various
social factors, such as distance to the nearest contemporary road or to a site offering
services or religious or social resources, may also be considered characteristic of a
location.


Approaches of the kind just described typically summarize empirical data
observed or measured at known site locations through tables or various descriptive
statistics. The ability to “predict” in general terms on the basis of these data
patterns is implicitly or explicitly recognized. Many archaeological studies of this
type have depended largely or wholly on the use of nominal-level categories for
investigating site location patterning. One such predictive model developed by
Bettinger (1977:220) was constructed for “predicting the distribution, function, and
density of archaeological materials in the Inyo-Mono region.” This model simply
divided the study region into biotic communities and projected expected numbers
of various site types in each community based on site density estimates obtained
from sample surveys. This is the most common approach in traditional site-location
investigations, and discussion of other examples (e.g., Brose 1976; Grady 1980;
Reher 1977; Thompson 1978) would be redundant.


Other investigators have focused on continuous site-location information
(e.g., Judge 1973; Findlow 1980; Hurlbett 1977). Such empirical data can be quite
useful in formulating projections about site locations. One might show, for example,
that x percent of sites occur within y distance of a drainage in a study region by
obtaining measurements of distance to water from a representative sample of
known sites in the region. Thomas and Bettinger (1976:362-363) go a step beyond
this by fitting normal distributions to empirical data on slope, distance to water,
distance to piñon ecotone, and elevation above valley floor obtained at site locations
in the Reese River Valley of central Nevada. The central portions of these normal
curves are taken to represent “ideal locations” for sites. Moving in either direction
along the curves (e.g., to steeper ground) decreases the probability that sites will be
found.
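

The idea of fitting a normal curve to measurements taken at site locations can be
illustrated with a minimal sketch. It is not a reconstruction of the Thomas and
Bettinger procedure, whose details are not given here: the slope values are
hypothetical, and the "relative suitability" score (density relative to the density at
the fitted mean) is an assumed convenience for showing how values fall off on either
side of the central portion of the curve.

```python
from statistics import NormalDist

# Hypothetical slope measurements (percent grade) at known site locations.
site_slopes = [2.0, 3.5, 4.0, 4.5, 5.0, 5.0, 5.5, 6.0, 6.5, 8.0]

# Fit a normal distribution using the sample mean and standard deviation.
fitted = NormalDist.from_samples(site_slopes)


def relative_suitability(slope: float) -> float:
    """Fitted density at `slope` divided by the density at the fitted mean."""
    return fitted.pdf(slope) / fitted.pdf(fitted.mean)


for slope in (fitted.mean, 8.0, 15.0):
    score = relative_suitability(slope)
    print(f"slope {slope:5.1f} percent: relative suitability {score:.3f}")
```

Because only site data enter the fit, the sketch says nothing about how common each
slope value is in the region as a whole.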


The practice of fitting theoretical distributions to data is a common one in
many disciplines (e.g., Cooper and Weekes 1983:20). The above procedure of
Thomas and Bettinger might seem useful for modeling site distributions, e.g.,
projecting site probabilities across the landscape. Such models are called single-class
classifiers in remote-sensing applications (Lin and Minter 1976; Minter 1975) because
they are used to describe the distribution of a specified class (e.g., a site-present
class) using data only from that class. (Such single-class approaches do not perform
as well as approaches that utilize a second class as a control group to contrast with
the group of interest; this latter approach will be described below.)


A problem with archaeological studies of the type discussed above is that they
often consider nominal- or interval-level variables only singly, on a univariate level.
Data often are not examined in a multivariate context, and as a result
interrelationships and redundancies between variables are seldom considered. Nor are
their joint effects on site location taken into account for prediction purposes, even
though a cursory inspection of the literature points to the multivariate nature of the
site location problem.


Control Groups 


An important methodological difficulty of many archaeological site location
studies is the failure to use a control group with which archaeological distributional
patterning may be compared. We might imagine, for example, a newspaper report
indicating that “90 percent of the inmates of Smith County jail are nonwhite
minorities.” Such statistics are often used in lay contexts, but a scientist seeks
background control data before formulating conclusions. If a control group obtained
by selecting a random sample of members of the entire population of Smith County
were to indicate that 90 percent of the population are nonwhite minorities, this
would suggest that the jail inmate proportions do not represent a noteworthy
pattern. This example has direct bearing on archaeological site location studies
because the same kind of initial argument is offered in many studies, namely that x
amount of sites are located within y distance of a resource.


In many disciplines, control data sets are routinely used. Quantitative
psychologists, for example, typically measure personality traits on a control group
selected randomly from the population. This reference body of data is then
compared with data from the group under study, e.g., suicide-prone individuals
(Overall and Klett 1972:257). Geologists have compared locations exhibiting high
levels of radioactive emissions with a control group of locations exhibiting low
emission levels in order to develop predictive models for uranium exploration
(Missallati et al. 1979). Remote-sensing scientists obtain spectral data from a variety
of environmental settings in order to amass a comparative background with which
spectral emissions of crop types of interest, such as wheat, can be compared
(Landgrebe 1978; Swain 1978). These techniques are common in pattern-
recognition studies (Duda and Hart 1973).


In archaeology a similar approach can be taken. Environmental or other
information can be measured at locations (land parcels) containing known
archaeological sites and then contrasted with a control group of identical measurements
obtained at random locations in a study region where sites are known to be absent.
By this means, environmental and other features bearing relationships with the
locations of sites might be identified. Data for such a variable as distance to nearest
drainage, for example, might be collected at a representative sample of
archaeological locations within a region (Figure 8.5a, top). Since the distribution of the
data is concentrated in the area of the graph representing short distances to water, a
typical archaeological conclusion might be that proximity to water is an important
factor in site location. Yet we must also ask how far any location within the region
under study is from a water source before drawing such a conclusion. A control
group of measurements of distance to water taken at random locations where sites
do not occur might yield an identical distribution (Figure 8.5a, middle), forcing the
conclusion that water is generally close to any location and that proximity to water
is not a significant factor in site location in this area. If, on the other hand, the
control data yielded a distribution with a central tendency some distance from
water (Figure 8.5a, bottom), the archaeologist might more justifiably arrive at the
conclusion that proximity to water is a significant locational factor (see Kvamme 1985a).
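

One way to make the site-versus-control comparison concrete is to measure how far
apart the two empirical distributions are. The sketch below uses the two-sample
Kolmogorov-Smirnov statistic (the largest gap between the two cumulative
distributions); this choice of statistic is an assumption made for illustration and is
not named in the discussion above, and all distances are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)   # share of sample_a at or below x
        cdf_b = bisect_right(b, x) / len(b)   # share of sample_b at or below x
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical distances to nearest drainage, in meters.
sites = [50, 80, 100, 120, 150, 200, 220, 300]
control_similar = [60, 70, 110, 130, 140, 210, 230, 310]    # like "middle" case
control_distant = [400, 450, 500, 600, 650, 700, 800, 900]  # like "bottom" case

print("sites vs similar control:", ks_statistic(sites, control_similar))
print("sites vs distant control:", ks_statistic(sites, control_distant))
```

A gap near zero corresponds to the case in which water is close to any location; a gap
near one corresponds to the case in which sites are distinctly closer to water than the
background.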


As the above example suggests, a control group approach may be essential to
forming valid conclusions concerning site location factors when empirical
archaeological data are used. Control groups serve several important functions. Their use in
model development is discussed below, but perhaps the most important use of
control group data is in model testing; it is only through the use of a control group
that the performance of a site location model may be properly assessed. Returning
to the example given in the introduction to this chapter, a model might classify every
location (land parcel) within a region as site-likely and thereby predict all actual site
locations with 100 percent accuracy, but such a model is useless. (In this case the
gain statistic would yield 1 - [100 percent of total area classified by model]/[100
percent of sites in model area] = 0.) On the other hand, by using a control group that
approximately represents the environment at large, it might be found that a site
location model encompasses only 60 percent of the land area of the region when
mapped, but it includes 90 percent of all sites within that area, representing an
amount of gain against which the utility of the model may be judged (in this case,
gain = 1 - 60/90 = 0.33; see the section below on “Assessing Model Performance” and
the discussion of gross errors and wasteful errors in Chapter 3).


The use of a nonsite control group also helps to clean up some conceptual
sloppiness. Through use of certain statistical procedures, we often wish to estimate
the probability of site-group membership at a location (land parcel). Obviously, this
probability often can be less than 1.00. But if the probability of site-class
membership is estimated as 0.6, what does the remaining 0.4 probability represent?
Logically, and consistently, this remaining probability represents site absence, the
complement of site presence. Thus, models for site presence must also consider site
absence, and nonsite data permit us to do this.


Nonsite locations should be selected from throughout the region in which the
sites under question are being modeled. If the region is large and diverse, with
multiple natural subgroupings of the environment (e.g., plains and mountains),
then the investigator might wish to examine site location patterning within each
grouping (a plains model and a mountains model). Such a practice could lead to
enhanced precision of predictions. In this case, it is appropriate to randomly select
nonsites according to the groupings (strata).


Figure 8.5. Uses of control data in empirical studies. (A) Distances to water in a hypothetical study
region. The top histogram indicates the empirical distribution measured at a representative sample of site
locations. If distances to water were measured at random locations where sites were known to be absent
and the middle distribution resulted, it could be concluded that proximity to water is not a significant
site-location factor. If the bottom distribution resulted, the importance of proximity to water would be
indicated (after Kvamme 1985b). (B) A two-dimensional measurement space where X1 might be ground
slope and X2 might be distance to water. The decision boundaries attempt to separate the site (hollow
circle) and nonsite (dot) classes: (1) level-slice, (2) quadratic, and (3) linear decision boundaries (discussed
in text).


The use of background data sets has been explored to some extent in
archaeological site location studies. Plog and Hill (1971), Plog (1971), and Flannery
(1976:92-93) point to the importance of knowing conditions in the environment as a
whole before assigning significance to a particular factor in terms of site location,
and Plog (1968) and Zarky (1976) actually determine background characteristics in
their studies of prehistoric settlement systems. These studies focus on proportions
of gross environmental categories (e.g., arable land, mesas, river bottoms) in the
study region as a whole as a basis for contrast with the observed pattern of site
distribution with respect to the same categories; differences in proportion are
interpreted as implying some sort of selection on the part of the prehistoric
inhabitants for some of the environmental categories. In contrast to this focus on
large-area environmental types, which offers little information on the immediate
locations of archaeological sites, a technique in which control data are measured at
random “point” locations (e.g., land parcels or quadrats of very small size) at which
sites are known to be absent can provide suitable background contrasts to identical
information recorded at known site locations (Custer et al. 1983, 1986; Kvamme
1980, 1983b, 1984, 1985a, 1986; Larralde and Chandler 1981; Morain et al. 1981; Peebles
1981; Wells et al. 1981). A somewhat different approach, but one that uses an
identical methodology, measures control data for large land parcels (e.g., one-half or
one square mile) that contain no archaeological sites (Holmer 1979; Scholtz 1981;
Schroedl 1984; Zier and Peebles 1982).
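

The selection of site-absent control locations, optionally within environmental
strata as suggested earlier for large and diverse regions, might be sketched as follows.
The grid, the stratum labels, and the function name are hypothetical, and the sketch
assumes that every cell not listed as a site has been surveyed and found empty; in
practice only surveyed cells should be eligible as controls.

```python
import random


def sample_nonsites(cells, site_cells, per_stratum, seed=0):
    """Randomly pick up to `per_stratum` site-absent cells from each stratum.

    cells: dict mapping cell id -> stratum label (e.g. "plains").
    site_cells: set of cell ids known to contain sites.
    """
    rng = random.Random(seed)          # fixed seed keeps the draw repeatable
    by_stratum = {}
    for cell, stratum in cells.items():
        if cell not in site_cells:     # controls must be site-absent
            by_stratum.setdefault(stratum, []).append(cell)
    picks = {}
    for stratum, candidates in sorted(by_stratum.items()):
        k = min(per_stratum, len(candidates))
        picks[stratum] = rng.sample(sorted(candidates), k)
    return picks


# Hypothetical 10 x 10 grid: west half "plains", east half "mountains".
cells = {(r, c): ("plains" if c < 5 else "mountains")
         for r in range(10) for c in range(10)}
site_cells = {(0, 0), (3, 2), (7, 8)}

controls = sample_nonsites(cells, site_cells, per_stratum=5)
for stratum, chosen in controls.items():
    print(stratum, chosen)
```

The same environmental variables would then be measured at the chosen cells as at the
known site locations.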


Patterns and Classification: The Measurement Space


Scientists working in remote sensing, pattern recognition, statistics, and
decision theory have developed a number of ways to classify objects (individuals,
locations) into prespecified groups. A great deal of practical experience in predictive
modeling in geographic contexts has been gained by researchers attempting to
analyze and classify remotely sensed images.


In image analysis studies, multispectral scanners (MSS) on platforms in orbit
above the earth sense reflected radiation from the earth's surface (see Chapter 9,
this volume). The predictor variables are the various MSS bands in which reflected
radiation is measured. The basic unit of analysis is termed the pixel (picture
element), which corresponds to a small area on the ground. Reflected radiation
values are measured on each MSS band (the variables) for each pixel in the region of
interest; image classification scientists then use the measured reflectance
characteristics to classify each pixel into likely (prespecified) groups of interest, such as
wheat vs nonwheat, forest vs nonforest, or urban vs nonurban areas (Landgrebe 1978).


The analogy with our archaeological problem is clear: in many site-location
modeling approaches we want to classify locations (analogous to pixels and often
small in area) into site-likely, site-type-likely, or site-unlikely categories on the
basis of the variables (usually measuring terrain or environmental characteristics)
measured at the locations. Modeling approaches that utilize computer-based
geographic information system (GIS) techniques (Hasenstab 1983; Kvamme 1983b,
1986; Chapter 10, this volume) actually grid entire study regions into small cells
(pixels) and treat these cells as the units of analysis. As a result of this general
similarity between the problems of remote-sensing classification and those of
site-location modeling, many of the techniques presented in this chapter are
borrowed directly from pattern-recognition and image-classification studies.


In pattern-recognition and image-analysis research, measurements obtained
at locations belonging to known categories are often called training data because
they are used to develop or “train” classification functions. These functions are
numerical decision rules that utilize class characteristics (i.e., measurements) to
classify entities whose group membership is not known (Swain 1978:142). In an
archaeological context, samples of known archaeological site locations constitute a
training set, and measurements of environmental and other variables at each of the
sites provide a site class characterization. If a control group of site-absent locations is
used, measurements at these locations provide a nonsite class characterization.
Patterning represented by the measurements of each class can be used to assign
future locations (for which site presence/absence is unknown) to one of the classes
in a predictive sense. Exactly how this is accomplished depends on the nature of the
technique used (several alternative methods are presented later), but all techniques
for accomplishing this goal have an underlying similarity.


Archaeological locational data typically occur as a series of points or small areas
on maps that represent the locations of known archaeological sites or artifact
clusters. These site locations might suggest an intuitively identifiable settlement
pattern; for example, the sites might be located along high terrace ridges above
major drainages within stands of oak. In working with classification procedures that
can be used to model a site location pattern in an objective manner, however, a more
abstract concept of the term pattern is required. Characteristics of a location are
reduced to a series of measurements (which may be categorical), and the
classification procedure compares the measurements with a set of previously made
measurements that are “typical” of known classes, such as site-present and site-absent
categories. The location is assigned to the group whose measurements are most
similar to its own. In other words, as far as a classification procedure is concerned,
after the measurements are obtained the physical form of the location and of the
surrounding environment are unimportant: the set of measurements is the
environmental (or other) pattern of the location (see Swain 1978:139). In general, we
might think of the environmental terrain characteristics of a location simply as a set
of measurements, not in terms of their physical form.


The n measurements made at a location define a point in n-dimensional space,
which is referred to as the measurement space. The purpose of a classification procedure
is to divide the measurement space into appropriate decision regions, each
corresponding to a specific discriminable class, and to assign the measurements made at a
location to the class that corresponds to the decision region in which it falls. A
two-dimensional measurement space where X1 might be ground slope and X2
might be distance to nearest water is illustrated in Figure 8.5b (above). The
site-present locations (hollow circles) tend to possess level ground and are close to
water, while the site-absent locations (the black dots) overall tend to be on
somewhat steeper ground and farther from water. The decision boundaries (several
are presented for later reference) attempt to separate the two classes. If X1 and X2
were measured on a map at some location where site presence/absence is unknown,
the location would be identified as more similar to the site or the nonsite group,
depending on where its measurements fall relative to the currently defined decision
boundary. Of course, some nonsite locations will always fall on the site side of the
decision boundary and some sites may fall on the nonsite side, which introduces an
amount of error that we attempt to minimize. (The case exemplified in Figure 8.5b
is an oversimplification since, in practice, we work with many more variables
[dimensions], which provide more information and help to reduce error.)
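

The notion of assigning a location to the class whose measurements are most similar to
its own can be sketched with a minimum-distance-to-means rule in a two-dimensional
measurement space. This particular rule is an assumption chosen for brevity, not one of
the boundaries drawn in Figure 8.5b, although it does produce a linear boundary (the
perpendicular bisector between the two class means). The training values are
hypothetical, and the variables are left unscaled, so the variable with the larger
numeric range (here slope) dominates the distance.

```python
from math import dist
from statistics import fmean

# Hypothetical training data: (X1 ground slope in percent, X2 distance to water in km).
sites = [(2.0, 0.1), (3.0, 0.2), (4.0, 0.3), (5.0, 0.2), (3.0, 0.4)]
nonsites = [(10.0, 0.9), (12.0, 1.2), (8.0, 0.7), (15.0, 1.5), (9.0, 1.1)]


def centroid(points):
    """Mean of each measurement across a class's training locations."""
    return tuple(fmean(axis) for axis in zip(*points))


CENTROIDS = {"site": centroid(sites), "nonsite": centroid(nonsites)}


def classify(location):
    """Assign a location to the class whose centroid is nearest."""
    return min(CENTROIDS, key=lambda label: dist(location, CENTROIDS[label]))


# Locations where site presence/absence is unknown.
for unknown in [(4.0, 0.3), (11.0, 1.0)]:
    print(unknown, "->", classify(unknown))
```

With more variables the tuples simply grow longer; the rule itself is unchanged.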


Although the above example utilizes continuous data, categorical data can be
approached in the same way. When dealing with such variables the measurement
space is best seen as a table, with one dimension representing class partitionings
according to one variable (e.g., plant communities) and the other dimension
comprising class partitionings according to another variable (e.g., topographic
categories).


Practical Statistical Concerns in Model Development


In earlier chapters a great deal has been said about statistical inferential
techniques and their proper application in site location studies, particularly with
regard to meeting various statistical assumptions. It is often difficult, however, to
meet many of these assumptions in real-world applications. This section briefly
discusses certain statistical difficulties pertaining to sampling and model
development.


A concern commonly voiced in regional archaeological studies pertains to
apprehensions about cluster sampling and the problem of obtaining representative
or unbiased samples of sites from within a region (e.g., Berry 1984; Mueller 1974;
Thomas 1975; Chapter 5, this volume). What is meant by representativeness is that
the characteristics of a sample (i.e., sample statistics) are unbiased estimates of the
true parameters of the population under study (e.g., the mean slope value
estimated from a sample of sites provides a good estimate of the true mean slope value
for the population of all sites in the region of study).


Some archaeologists maintain that the only way to obtain unbiased site
samples in a regional context is through simple random sampling, but to obtain a
correctly drawn simple random sample of sites from within a region, the location of
every site would have to be known beforehand, each would be assigned a number,
and a simple random sample would be obtained by random selection of the
numbered sites (see Thomas 1975:78). Clearly this approach is impractical;
moreover, if every site were known prior to the sample selection, there would be no
need for the site-location modeling exercise.


An alternative procedure that would allow simple random sampling requires
that the researcher superimpose a small-mesh grid (where each grid cell is
approximately the size of a typical archaeological site) over the region of study. Each grid
cell is assigned a number, a simple random sample of cells is drawn, and this sample
is then surveyed. Most of the cells will not contain sites, owing to the relative
rarity of sites (see “Base Rate Probabilities,” below). If sites are very rare, for example
occupying only about 1 percent of the cells, then this procedure also presents
difficulties since many thousands of cells would have to be surveyed to obtain
reasonable site sample sizes. Additionally, the problem of traveling to and locating
numerous randomly placed small cells presents a nontrivial factor that must be
considered.
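

The scale of the difficulty can be seen with a small calculation. The sketch assumes
that each randomly drawn cell independently contains a site at the stated base rate and
reports the expected number of cells that must be surveyed; the target sample sizes
and the function name are hypothetical.

```python
import math


def cells_needed(target_sites: int, site_rate_percent: float) -> int:
    """Expected number of random cells to survey to find `target_sites` sites."""
    if not 0 < site_rate_percent <= 100:
        raise ValueError("site_rate_percent must be in (0, 100]")
    return math.ceil(100 * target_sites / site_rate_percent)


# Sites occupying about 1 percent of the grid cells, as in the example above.
for target in (30, 50, 100):
    print(f"{target} sites -> about {cells_needed(target, 1.0)} cells surveyed")
```

Because this is an expectation, any single survey of that many cells could yield
noticeably fewer or more sites.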


Even if we could obtain simple random samples in regional surveys, problems
would still arise in attempts to draw statistical inferences during model
development. Most techniques of statistical inference assume independent observations.
Statistical independence in terms of areally distributed data implies that when
observations are ordered in space it should not be possible to have a better than
random chance of predicting values of some observations when other values are
known. As Gould (1970:443) points out, “it is doubtful that one could find an
assumption that is more at variance” with geographical data; spatially distributed
phenomena generally possess regular spatial variation or positive spatial
autocorrelation (Cliff and Ord 1973), thus violating independence assumptions. Tobler
(1970:234) has referred to this property as “the first law of geography: everything is
related to everything else, but near things are more related than distant things.”


Many environmental phenomena commonly examined in archaeological
studies, particularly distance measures, exhibit significant levels of spatial
autocorrelation. To illustrate, I undertook a simulation study (Kvamme 1985b) that utilized
simple random samples (n = 100) of 1 ha locations from a 100 km² region. The data
were obtained from a working geographic information system (see Chapter 10),
which facilitated the simulation. At each location, elevation and slope (percent
grade), two commonly used variables in archaeological studies, were determined.
An autocorrelation coefficient (Cliff and Ord's [1973] I statistic) was calculated for
these variables for each of five simulation runs (where a new sample was selected for
each run). In an associated significance test that yielded standard z-scores (a
common statistic used to evaluate significance; see Thomas 1976), the z-scores
ranged from 3.71 to 10.17 for slope (with an average of 6.15) and from 7.72 to 9.95 for
elevation (with an average of 8.46). These scores indicate highly significant levels of
spatial autocorrelation for these rather common variables, pointing to a lack of
independence between the observations, since a z-score of 1.64 is significant at the
0.05 level and a z-score of 3.72 is significant at the 0.001 level (these tests were
one-tailed). Thus, the real world presents difficulties even for simple random
samples; researchers who somehow are able to obtain them and argue statistical
correctness may be working with a false sense of security.
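

A rough sketch of an autocorrelation check of this kind follows. It computes Moran's I
with simple binary neighbor weights and obtains a z-score by repeatedly shuffling the
values among the locations. Both the weighting scheme and the permutation-based z-score
are assumptions made to keep the example short; they stand in for, and are not, the
analytic significance test used in the simulation study described above. The grid
and elevation values are hypothetical.

```python
import random
from math import dist
from statistics import fmean, pstdev


def morans_i(coords, values, max_dist):
    """Moran's I with binary weights: w_ij = 1 when 0 < d_ij <= max_dist."""
    n = len(values)
    mean = fmean(values)
    dev = [v - mean for v in values]
    cross = 0.0    # sum over neighbor pairs of dev_i * dev_j
    weights = 0.0  # sum of all weights
    for i in range(n):
        for j in range(n):
            if i != j and dist(coords[i], coords[j]) <= max_dist:
                cross += dev[i] * dev[j]
                weights += 1.0
    return (n / weights) * cross / sum(d * d for d in dev)


def permutation_z(coords, values, max_dist, runs=199, seed=0):
    """z-score of the observed I against I values from shuffled locations."""
    rng = random.Random(seed)
    observed = morans_i(coords, values, max_dist)
    shuffled = list(values)
    sims = []
    for _ in range(runs):
        rng.shuffle(shuffled)
        sims.append(morans_i(coords, shuffled, max_dist))
    return (observed - fmean(sims)) / pstdev(sims)


# Hypothetical 6 x 6 grid whose "elevation" rises smoothly toward one corner.
coords = [(r, c) for r in range(6) for c in range(6)]
elevation = [10.0 * (r + c) for r, c in coords]

print("Moran's I:", round(morans_i(coords, elevation, 1.0), 3))
print("permutation z:", round(permutation_z(coords, elevation, 1.0), 2))
```

A smoothly varying surface such as this one gives a strongly positive I, which is the
pattern the text describes for elevation and slope.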


In regional archaeological analyses we often have no choice but to use some
form of cluster sampling to obtain representative samples of sites from within a
region. As Holmes (1970:381) states,


Hence, it is important to examine the effects of cluster sampling on quantitative
classification models before looking at the modeling approaches themselves, since
many models of necessity are based on cluster-sampled data.


The typical sampling practice employed in image-analysis studies (Moik
1980:Fig. 8.7; Schowengerdt 1983:192, Fig. 3-30) is informative. This procedure is
portrayed in Figure 8.6a, which illustrates forested areas (shaded) and nonforested
areas (unshaded). The large blocks represent ground-truthed clusters of small cells
or pixels of known group membership; in the remainder of the image, group
membership is unknown. In each of the pixels, measurements of reflected radiation
are recorded. A goal might be to develop a predictive model that classifies forest
locations in the remainder of the image based upon reflectance characteristics of the
known forest and nonforest pixels. This form of cluster sampling is somewhat more
extreme than that typical in archaeological sampling, since the analysis elements
(pixels) occur in tight, contiguous blocks (compare Figure 8.6a with Figure 8.6b, a
typical archaeological sampling example). One might expect that in the Figure 8.6a
example there would be a high degree of positive spatial autocorrelation because of
the increased relative proximity between analysis elements. A second drawback of
cluster-sampled data is that within-class variation tends to be underestimated (this
follows from the reduced variability within clusters), making classes appear more
different than they really are. The possible drawbacks of cluster sampling must be
weighed against its benefits; in the remote-sensing case (as in archaeology), cluster
sampling is much less difficult and costly than obtaining simple random samples of
elements. As Schowengerdt (1983:192) states,


In all random sampling procedures, it is desirable to select random groups of pixels rather
than single pixels because of the practical difficulty in accurately locating single pixels on
the ground [emphasis original].


In a classification perspective, the detrimental effects of well-designed cluster
sampling seem to be small, as indicated by excellent classification results typically
obtained by remote-sensing studies (Moik 1980; Schowengerdt 1983; Swain and
Davis 1978). It is easy to see why this is true: a classification procedure only
partitions the measurement space (Figure 8.5b). Differences between the
measurement spaces defined by simple random samples and those defined by suitably
constructed cluster samples are rather small when compared with differences
between discriminable classes, particularly when the site-present and site-absent
contrast discussed earlier is used. One simulation study (Campbell 1981) compared
the performance of the dense cluster sampling practice illustrated in Figure 8.6a
with simple random samples in classifying forested areas (vs nonforested areas) in
several different Landsat scenes. The classification accuracy of the predictive
(discriminant analysis) models obtained from the less autocorrelated simple random
samples ranged from 15 percent better to 2 percent worse (an average of 6 percent
better) than that of the corresponding models obtained from the more highly
autocorrelated cluster samples. The lesson to be learned from this evidence is that
we should not be too concerned with the detrimental effects resulting from the use
of cluster samples, considering the benefits that are derived in return.


Notwithstanding these results, the question about the correctness of using
statistical inferential procedures in these contexts still remains. Researchers faced
with similar problems have developed robust model validation procedures in
remote sensing and elsewhere (Schowengerdt 1983; Swain 1978). As described
earlier, a site location model can be viewed simply as a classification or decision rule.
For the moment let us forget how such a model might be developed. Rather, let us
focus only on the idea that we have a decision rule, however it was derived, and that
we can apply it to measurements obtained at locations (land parcels). Based on these
measurements, the decision rule yields, at the very simplest, an assignment to one
of two categories (for example, site present or site absent). If the decision rule has
some predictive capacity in terms of the populations under study, then it should
offer correct decisions more often than could be attributable to chance. This notion
can be tested in practice by obtaining new random samples of known site-present
and site-absent locations (both entirely independent of any samples that might
have been used earlier in model development), by applying the decision rule to the
measurements at these locations, and by determining how well these sites and
nonsites are classified. If the percentage correctly classified is greater than that
attributable to chance, then the decision rule has some predictive capacity, and it is
here, in this testing phase, that methods of statistical inference are more
appropriately applied. Relatively simple statistical testing procedures can be used to
assess the significance of model classification results (see below).
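
The testing logic just described can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: the test-sample counts are
hypothetical, the chance rate of 0.5 assumes a two-group problem with equal
group sizes, the function name `binomial_upper_tail` is invented for the
example, and a one-sided exact binomial test stands in for whichever simple
testing procedure a particular study might adopt.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, chance: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1.0 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical independent test sample: 100 locations, 65 classified correctly.
n_test = 100
n_correct = 65
chance_rate = 0.5  # assumed chance accuracy for two equally sized groups

p_value = binomial_upper_tail(n_correct, n_test, chance_rate)
print(f"accuracy = {n_correct / n_test:.2f}, one-sided p = {p_value:.4f}")
```

A small p-value here would indicate only that the rule classifies the new
locations better than chance; it says nothing about how the rule was built.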


The use of independent test samples makes this overall approach robust
because performance is assessed on entirely new sets of data, which gives an
excellent idea of true model performance in practice and obviates the need for
reliance on the assumptions of multivariate statistical theory (e.g., multivariate
normality and homogeneity of variance) in the model-development stage. Note
that in this scheme it does not matter what procedures are used to develop a model
or decision rule, since all inferences about the usefulness of a model are drawn from the
independent test samples. Thus, any procedure can appropriately be used to formulate a
decision rule, from simple subjective notions about site locations to complex
multivariate data models. Regardless of the model-building procedure used,
however, the statistical assessment of its worthiness is made through independent test
samples. This is the approach taken in the remainder of the chapter. Such
multivariate techniques as multiple discriminant analysis and logistic regression are
used in subsequent sections for model development, but only as a means of obtaining a
partitioning of the measurement space in the form of a decision rule. These
algorithms are based on very powerful mathematical differencing techniques that
are able to provide excellent partitionings of the measurement space even when
underlying assumptions are not fully met.


Example Analysis Based on Locational Characteristics 


A site location study performed in the Glade Park region of western Colorado
(Kvamme 1983c) can be used to illustrate model building based on locational
characteristics. In this section these data are used to illustrate one approach to
model development based on environmental and terrain characteristics observed at
the known site and nonsite locations found by that study; a later section will use
these data to illustrate model-testing procedures. For simplicity the model is
developed for the locations of all open-air sites within Glade Park, although
identical methods would apply for specific site-type model development (see
“Modeling Individual Site Types,” below). Only environmental factors are
considered in this analysis. Variables measuring social factors would, if available, be
treated in an identical manner, but in the present study, which dealt with
hunter-gatherer archaeology, contemporaneity of the sites was impossible to establish from
the survey data and such features as central-place settlements simply did not exist.
The analysis was carried out by treating each hectare (100 by 100 m parcel) as the
unit of investigation and then comparing land parcels that included sites with land
parcels that did not.


The Glade Park study region, encompassing nearly 650 mi², lies on the western
border of Colorado in the Bureau of Land Management's Grand Junction Resource
Area. This arid region of mesa and canyon country is covered by piñon-juniper
forests interspersed with grassy clearings and is archaeologically one of the richest
areas of Colorado outside the southwestern portion of the state (Wormington and
Lister 1956). The archaeological sites uniformly consist of small scatters of chipped
stone artifacts, lithic debris, and occasional ground stone; ceramics are extremely
rare.


Sampling 


The purpose of the archaeological survey conducted in Glade Park was to
obtain a random sample of site locations to be used in a modeling study of patterns of
prehistoric site distribution. This was accomplished by surveying 38
quarter-sections randomly selected from a total of nearly 2600. These quarter-sections were
gridded into 64 units of 1 ha each, which were the primary analysis units. Prehistoric
sites were discovered in 157 of these 1 ha units out of a total of 2432 units examined.
Of the 2275 land parcels that did not contain sites, a random sample of 157 was drawn
to serve as the nonsite control group. It should be noted that, because nonsite
locations were selected from a limited number of quarter-section clusters,
environmental (and other) variation may have been underestimated, which can have a
deleterious effect on the performance of the resulting model. This practice of
selecting nonsites from the same clusters as those in which sites are found also tends
to make nonsite samples more similar to site samples than is really the case in
nature, and this also can weaken a model. Although the Glade Park sample may not
be optimal, it will be shown that very good results can be obtained through the
nonsite sampling procedure used here.


An alternative nonsite selection approach that resolves these problems to some
extent was used in a Colorado plains site location study (Kvamme 1984). This
approach recognizes that in many regions archaeological sites are an extremely rare
phenomenon, occurring by chance on the order of around 1 percent of the time (see
the section below on “Base Rate Probabilities”). In other words, for every acre in a
region that contains a site there might be 99 acres where no sites occur. The
alternate approach draws a simple random (or other) sample of control locations
from across the entire landscape of the region regardless of whether or not the
locations have been field inspected for archaeological resources. The advantages of
this approach are (a) that the resulting control group represents the true range of
background environmental variation and (b) that levels of spatial autocorrelation
are reduced (since selection is not by clusters). The disadvantage is that by chance a
small percentage (in the above example around 1 percent) of the control locations
actually falls on sites, which introduces error in group identification. This error is
usually negligible and has little effect on analysis. A control group obtained in this
manner may still be referred to as a “nonsite” group since under such conditions the
vast majority of the group (99 percent in this example) really are nonsites.
Obviously, in areas where the probability of finding a site is high this procedure
should not be undertaken.
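
The size of the labeling error introduced by this approach follows directly
from the base rate. The short sketch below works through the arithmetic; the
control-group size of 1000 is a hypothetical figure chosen for illustration,
the 1 percent base rate is the example value used above, and the function name
is invented.

```python
def expected_contamination(n_controls: int, base_rate: float) -> float:
    """Expected number of randomly drawn control locations that fall on sites."""
    return n_controls * base_rate


# Assumed values for illustration: 1000 control locations, 1 percent base rate.
n_controls = 1000
base_rate = 0.01

mislabeled = expected_contamination(n_controls, base_rate)
true_nonsites = n_controls - mislabeled

print(f"expected mislabeled controls: {mislabeled:.0f}")
print(f"share of true nonsites: {true_nonsites / n_controls:.0%}")
```

As the base rate rises, the expected number of mislabeled controls rises in
proportion, which is why the procedure is unsuitable where sites are common.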


Environmental Variables


Fourteen environmental and terrain variables were measured at the center of
each of the 157 site and 157 nonsite units. The variables were measures pertaining to
landform, water, view, and shelter. The landform variables were slope measured as
percent grade (Figure 8.2a) and local relief within 100, 250, 500, and 750 m (Figure
8.2b). The water variables consisted of horizontal and vertical distance to nearest
stream and to nearest permanent river. The view variables were distance to nearest
point of vantage and a measure of the angle of view (Figure 8.3a). The shelter
variables consisted of aspect measured relative to north or south (using the 180°
rescaling technique described above) and shelter volume measured within 100 m
and 250 m (Figure 8.3b) but rescaled such that low (negative) values suggest
relatively little shelter (hills) and high (positive) values suggest relatively high
shelter (valleys).


Univariate Examination


The sample means, trimmed means (removing 15 percent of the largest and
smallest values), medians, and standard deviations (Chapter 5, this volume) for the
site and nonsite samples are presented in Table 8.1. It is readily apparent that some
major differences in environmental patterning exist between the site-present and
site-absent (nonsite) groups. For example, sites tend to occur closer to water, on
relatively level ground, and in regions of less local relief, and they tend to have
better views. Sites also tend to occur under limited ranges of environmental
variation (as indicated by somewhat smaller standard deviations).
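
For readers unfamiliar with trimmed means, the following sketch shows how the
four descriptive statistics might be computed for one variable. The slope
readings are hypothetical, the helper name `trimmed_mean` is invented, and the
trimming rule assumed here (dropping 15 percent of the cases from each tail,
rounded down) is one reading of the description above rather than a confirmed
detail of the original analysis.

```python
from statistics import mean, median, stdev


def trimmed_mean(values, proportion=0.15):
    """Mean after dropping `proportion` of the cases from each tail."""
    ordered = sorted(values)
    # Small epsilon guards against floating-point products such as 2.9999999.
    k = int(len(ordered) * proportion + 1e-9)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return mean(kept)


# Hypothetical slope readings (percent grade) with a few extreme values.
slopes = [2, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 20, 22, 25, 30, 45, 90]

print("mean        :", mean(slopes))
print("trimmed mean:", round(trimmed_mean(slopes), 2))
print("median      :", median(slopes))
print("s.d.        :", round(stdev(slopes), 1))
```

Because the extreme readings are discarded, the trimmed mean sits closer to the
median than the ordinary mean does, which is the behavior visible in Table 8.1.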


Given such results, most researchers attempt to assess the statistical
significance of the data patterning (e.g., Lafferty 1981; Larralde and Chandler 1981). The
two-sample t-test (Thomas 1976:227) is a test for the difference between means, but
use of this test requires such assumptions as normality and equal group variances.
The Mann-Whitney test (Conover 1971:224) is a nonparametric alternative, and the
Kolmogorov-Smirnov test can be used to assess distributional differences of any
kind (Conover 1971:309). As noted in earlier sections, however, use of these tests in
spatial contexts is problematic because of positive spatial autocorrelation, which
violates the common assumption of independence. Since the Glade Park spatial data
are derived from cluster sampling, we might expect the level of spatial
autocorrelation to be rather high.


One way to resolve this difficulty is to treat such tests conservatively by using
the 0.005 level instead of the 0.05 level, for example. When the t-test is used, the
absolute value of t itself can serve as a relative index of difference or separability
between classes. Currently there are no readily available significance tests for
assessing class differences in spatially autocorrelated contexts (however, see Cliff
and Ord 1975).


A modified t-test valid for unequal group variances (Steel and Torrie 1980:206)
was applied to the Table 8.1 data using a robust procedure that trims the largest and
smallest values in each group (Dixon et al. 1983:101), since the t-test is overly
sensitive to extreme scores. The t-statistics and associated two-tailed probabilities
are given in Table 8.1 and are presented only as a relative index of separability
between the site and nonsite groups. The t index suggests that certain variables,
such as slope, aspect, and view, are more separated than other variables.
Statistically, the results of the t-test should be viewed conservatively because of violations
of the independence assumption that result from spatial autocorrelation.
Additionally, even if the data could be assumed to be independent, the resulting statistics
would still be inflated because simultaneous inference methods (Miller 1966) were
not employed. Besides correlation between cases resulting from spatial
autocorrelation, the 14 variables are also positively correlated (e.g., horizontal and vertical
distance to water are highly related). Thus, the 14 individual significance tests are
not independent assessments; moreover, with 14 tests some are likely to appear
significant by chance alone.
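
As a rough illustration of the two ideas in this paragraph, the sketch below
computes an unequal-variance (Welch-type) t statistic and a Bonferroni-adjusted
per-test level. Several things here are assumptions: the slope values are
hypothetical, the trimming step used in the original analysis is omitted, the
function name `welch_t` is invented, and the Bonferroni adjustment is offered
as one familiar simultaneous-inference device, not necessarily the method
discussed by Miller (1966).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """t statistic for two samples without assuming equal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Hypothetical slope values (percent grade) at site and nonsite parcels.
site_slope = [4, 6, 8, 10, 12, 14, 9, 11]
nonsite_slope = [10, 18, 25, 31, 14, 40, 22, 28]

t = welch_t(site_slope, nonsite_slope)
print(f"t = {t:.2f}  (|t| is used here only as a separability index)")

# Bonferroni adjustment: with 14 tests, judge each at alpha / 14.
alpha = 0.05
n_tests = 14
per_test_level = alpha / n_tests
print(f"per-test level = {per_test_level:.4f}")
```

A negative t in this layout means the site group has the lower mean, as is the
case for slope in Table 8.1.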


Multivariate Assessment 


Before attempting to model the site and nonsite differences that appear to
exist in the Glade Park data (Table 8.1), it might be instructive to assess group
where $X_i$ refers to the vector of measurements of the multiple variables at
location $i$, $\mu_k$ contains the vector of multiple mean values associated with
class $k$, and $\Sigma_k$ is the corresponding dispersion matrix containing $j$ rows
and columns of variances and covariances for class $k$ (Swain 1978:156; also see
Green 1978 for a discussion of matrix algebra techniques). In practice, the means
and dispersion matrices are unknown; they are estimated by sample means,
variances, and covariances. An observation is assigned to the class for which it has
the greatest probability value.
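
The assignment rule can be illustrated for the simplest case of a single
variable, where the dispersion matrix reduces to one variance. The sketch below
is a simplification under stated assumptions: the class means and standard
deviations are hypothetical, the two classes are treated as equally likely
beforehand, and the function names are invented for the example.

```python
from math import exp, pi, sqrt


def normal_density(x, mean, sd):
    """Height of the normal curve with the given mean and s.d. at x."""
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2.0 * pi))


def classify(x, class_params):
    """Return the class label whose estimated density at x is greatest."""
    return max(class_params, key=lambda label: normal_density(x, *class_params[label]))


# Hypothetical sample estimates (mean, s.d.) of slope in percent grade.
params = {"site": (12.0, 11.0), "nonsite": (24.0, 18.0)}

for slope in (5.0, 20.0, 45.0):
    print(slope, classify(slope, params))
```

In the multivariate case the same comparison is made with multivariate normal
densities, so the covariances among variables also enter the calculation.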


Although discussion of matrix algebra is beyond the scope of this presentation, 
a simplified description of the technique follows. For a single variable we can 
imagine a normal probability curve with its maximum height or density at the mean 
value and with a width that is indicative of the variation in the distribution. For any 
value of a variable we can determine the density (height) of the distribution. 
Similarly, in a multivariate context multiple measurements can be assessed by the 
above formula relative to the multiple mean values for a class, considering at the 
same time the nature of the dispersion within that class, and this yields a multivar- 
iate normal density value. A density can be determined for each class under 
consideration. To illustrate with hypothetical values, if the multivariate density for
Class A is determined to be 0.3 for the multiple environmental
TABLE 8.1.
Descriptive statistics for Glade Park sites and nonsites

                                           Trimmed                       t-test
Variable                           Mean       Mean   Median    s.d.       t        p

X1   slope (% grade)
       sites                         12         10       10      11   -5.28   0.0000
       nonsites                      24         18       13     [?]

X2   relief within 100 m (m)
       sites                         30         77       24      20   -3.00   0.0030
       nonsites                      39         34      [?]     [?]

X3   relief within 250 m (m)
       sites                         70         65       61      45   -2.65   0.0085
       nonsites                      85        [?]       73      52

X4   relief within 500 m (m)
       sites                        133        130      134      63   -3.11   0.0021
       nonsites                     [?]        146      146     [?]

X5   relief within 750 m (m)
       sites                        180        180      183      78   -3.33   0.0010
       nonsites                     217        200      [?]     112

X6   horizontal distance to permanent water (m)
       sites                       2273       1943     1950    1960    0.75   0.4566
       nonsites                    2426        224     2300    1689

X7   vertical distance to permanent water (m)
       sites                        129        [?]      [?]     138   -2.16   0.0319
       nonsites                     168        125      125     175

X8   horizontal distance to nearest water (m)
       sites                        164        139      100     133   -1.94   0.0535
       nonsites                     194        183      200     [?]

X9   vertical distance to nearest water (m)
       sites                         33         26        4      37   -2.22   0.0270
       nonsites                      44         33       24     [?]

X10  vantage (m)
       sites                        [?]        [?]      [?]     277    1.14   0.2556
       nonsites                     130         86      100     182

X11  view angle (0-360°)
       sites                        219        222      220      73    5.51   0.0000
       nonsites                     174        178      180     [?]

X12  aspect (0-180°)
       sites                        [?]         57       50      48   -4.15   0.0000
       nonsites                     [?]         87      [?]      55

X13  shelter within 100 m
       sites                        -14         -9        6      43   -2.94   0.0035
       nonsites                       3        [?]       -3      57

X14  shelter within 250 m
       sites                       -140        -93       -9     546   -1.63   0.1037
       nonsites                     -28        -78      -73     654

Applying identical calculations to the 1423 sample locations yields the initial
model accuracy rates and gain statistic shown in Table 8.4. The results of the
maximum likelihood technique applied to each of the 19,000 locations in the test
study region are mapped in Figure 8.7b. Although the classification accuracy is
about the same as that provided by the discriminant analysis, note that the
maximum likelihood procedure maps a relatively smaller portion of the region as
site likely because it takes into account the lesser environmental variation usually
exhibited by a site-present class while the discriminant analysis model used above
does not.


Logistic Regression 


Multiple logistic regression has recently come into use as a classification
technique (e.g., Maynard and Strahler 1981; Pindyck and Rubinfeld 1976:237-263;
Schmidt and Strauss 1975), and it has been applied in several studies of
archaeological site location (Custer et al. 1983, 1986; Holmer 1982; Kvamme 1983b, 1983c, 1986;
Parker 1985; Scholtz 1981). This nonparametric technique makes no assumptions
differences by considering all available information (the 14 variables)
simultaneously. Hotelling’s T², a multivariate extension of the t-test, and one-way
multivariate analysis of variance (MANOVA) are traditional parametric procedures for
performing such a task (Morrison 1976). Recently, a nonparametric alternative has
been presented for a similar problem context in archaeology. Multi-Response
Permutation Procedures (MRPP) originally were introduced to archaeology for
assessing artifact class locational differences in real space based on positional
coordinates (Berry et al. 1980, 1983, 1984). MRPP can be used in the present situation to
assess multivariate site and nonsite class locational differences in measurement space
(Figure 8.5b). Since MRPP are based on a randomization procedure, they are
extremely robust. If substantial class differences are found, this result would
suggest that the site and nonsite locations occupy different regions of the
measurement space. Site location modeling procedures then might have a reasonable chance
of partitioning the measurement space into appropriate decision regions, providing
a successful classification model.
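
The randomization idea behind a permutation procedure of this kind can be
sketched as follows. This is a simplified two-group illustration, not a
reproduction of the computations in the cited papers: it assumes Euclidean
distance in measurement space, weights each group's average within-group
distance by its share of the cases, and estimates the p-value by Monte Carlo
relabeling. The data points (two variables per location) and all function
names are hypothetical.

```python
import random
from itertools import combinations
from math import dist


def mean_within_distance(points):
    """Average Euclidean distance over all pairs of points in one group."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mrpp_delta(group_a, group_b):
    """Group-size-weighted mean of the within-group average distances."""
    n_a, n_b = len(group_a), len(group_b)
    total = n_a + n_b
    return ((n_a / total) * mean_within_distance(group_a)
            + (n_b / total) * mean_within_distance(group_b))


def mrpp_p_value(group_a, group_b, n_permutations=2000, seed=1):
    """Monte Carlo estimate of P(delta <= observed) under random relabeling."""
    rng = random.Random(seed)
    observed = mrpp_delta(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if mrpp_delta(pooled[:n_a], pooled[n_a:]) <= observed:
            hits += 1
    # Count the observed arrangement itself so the estimate is never zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical (slope, relief) measurements for eight sites and eight nonsites.
sites = [(5, 20), (8, 25), (10, 30), (12, 22), (7, 28), (9, 18), (11, 26), (6, 24)]
nonsites = [(25, 60), (30, 72), (35, 65), (28, 80), (40, 70), (33, 58), (38, 75), (27, 68)]

p_value = mrpp_p_value(sites, nonsites)
print(f"estimated p = {p_value:.4f}")
```

When the two groups occupy different parts of the measurement space, almost no
random relabeling produces groups as compact as the observed ones, so the
estimated p-value is small.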


The Glade Park site and nonsite locational data were subjected to an MRPP
analysis. The simultaneous comparison of all 14 site and nonsite environmental
characteristics indicates an extreme difference between the two classes that was
significant at p = 0.00000000032. This suggests that the Glade Park locations with
sites tend to be markedly different from locations without sites in terms of
environmental characteristics (see “Interpretation and Explanation of Data Patterns”
for a discussion of how such data patterns can be interpreted).


Site Location Models 


The technique chosen for site location model development at Glade Park is
multiple logistic regression. This classification algorithm is particularly robust
because, unlike many other classification techniques, it does not assume a particular
underlying distributional form (Press and Wilson 1979) but achieves a partitioning
of the measurement space (Figure 8.5b) based on the empirical distribution of the
particular data set used (see Chapter 5 and the discussion below for more details
about logistic regression). The following logistic regression equation was obtained
through the BMDP program LR (Dixon et al. 1983):


$$
\begin{aligned}
L = {} & 0.713 - 0.0390X_1 - 0.00454X_2 + 0.00602X_3 + 0.00524X_4 - 0.00540X_5 - 0.0000828X_6 \\
& - 0.00184X_7 - 0.000628X_8 - 0.0124X_9 - 0.00808X_{10} + 0.00748X_{11} - 0.00919X_{12} \\
& - 0.0178X_{13} + 0.000744X_{14}
\end{aligned}
$$


where the variables referred to by the $X_i$ may be found in Table 8.1. The value of L
theoretically can range between positive and negative infinity; positive values 
denote locations in the site portion of the measurement space, negative values 
indicate locations in the nonsite portion, and L = 0 represents locations that fall 
exactly on the decision boundary (Figure 8.5b). Thus, L represents a decision rule 
that can be used to assign locations to site or nonsite classes on the basis of their 
measurements. Additionally, large positive or negative values indicate locations 
having characteristics that, overall, are more like the site or nonsite classes,
A logistic regression analysis was performed using the example data of 1423
site-present and site-absent locations and the BMDP program LR (Dixon et al. 
1983), and the result was the following function: 


$$
\begin{aligned}
L_i = {} & -6.8837 - 0.0043X_{1i} - 0.114X_{2i} + 0.0277X_{3i} - 0.0136X_{4i} + 0.00164X_{5i} - 0.0006246X_{6i} \\
& - 0.0043X_{7i} - 0.000777X_{8i}
\end{aligned}
$$


When applied to the measurements from 5LA5364 (Table 8.3), these equations yield
L = 1.9085 and p = 0.8708. Based on its environmental characteristics, 5LA5364 would
be correctly classified to the site-present group.
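
The calculation for 5LA5364 can be checked with a short script. The
coefficients are those of the eight-variable function given above; the
measurement vector is taken from the Euclidean distance calculation for the
same site elsewhere in this chapter, on the assumption that it matches the
Table 8.3 values. The function names are invented for the example.

```python
from math import exp

# Intercept and coefficients of the eight-variable function given above.
INTERCEPT = -6.8837
COEFFICIENTS = [-0.0043, -0.114, 0.0277, -0.0136, 0.00164, -0.0006246, -0.0043, -0.000777]


def logistic_score(x):
    """Linear score L for one location's measurements X1..X8."""
    return INTERCEPT + sum(b * v for b, v in zip(COEFFICIENTS, x))


def site_probability(score):
    """Map L onto the 0-1 scale: p = 1 / (1 + exp(-L))."""
    return 1.0 / (1.0 + exp(-score))


# Measurements assumed for 5LA5364 (X1..X8).
x_5la5364 = [31, 5, 24.4, 30.5, 6104, 72, 144, 144]

L = logistic_score(x_5la5364)
p = site_probability(L)
print(f"L = {L:.4f}, p = {p:.4f}")
print("site-present" if p > 0.5 else "site-absent")
```

Small differences in the fourth decimal place relative to the published values
are to be expected, since the printed coefficients are rounded.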


Model accuracy for the logistic regression application, as measured by the gain
statistic in Table 8.4, is slightly higher than it was for the previous, parametric
techniques. Figure 8.7c shows the results of mapping the logistic regression model
over the test study area. Since logistic regression makes no assumptions about
distributional form, it is usually regarded as a very robust procedure. This would
appear to be an advantage for archaeological locational modeling because site
location data are decidedly nonnormal, but in fact, the application of this technique
to the sample data produced results that are very similar to the results obtained by
respectively, than locations with small positive or negative values. L therefore
represents a single scale or axis representing an underlying environmental
continuum with “site-favorable” conditions on the positive extreme and
“site-unfavorable” (nonsite) conditions on the negative extreme.


In practice, the use of L is unwieldy because its values are unconstrained. A
simple transformation yields a value that ranges from 0 (large negative values of L)
to 1 (large positive values of L), with 0.5 indicating locations on the decision
boundary (L = 0).


$$
p_i = \frac{e^{L_i}}{1 + e^{L_i}} = \frac{1}{1 + e^{-L_i}}
$$





where $L_i$ is the logistic regression score measured at the ith location and $p_i$ is the
transformed value. Note that if (a) the data are obtained through simple random
sampling (generally impractical in archaeology, as noted above), and (b) the data
represent independent observations (generally impossible owing to the spatially
autocorrelated nature of archaeological data), then the $p_i$ values can be interpreted
as estimates of a location’s probability of membership in the site-present class
conditional on the measurements ($X_i$) made at the location. Since these two
conditions are not met in the Glade Park analysis, the $p_i$ values can best be
interpreted as standardized relative indications of location within the site-present
or site-absent portions of the measurement space.


To illustrate use of these formulas, suppose that a location is found to exhibit
measurements on 14 predictor variables identical to those presented for the site
group mean values in Table 8.1. When the site mean values and the first equation
are used, L = 0.6331; the second equation gives p = 0.6532. Thus, a location with those
environmental characteristics would be assigned to the site-present group since p >
0.5. When this procedure is applied to the measurements of all 314 site and nonsite
locations, the initial classification results are as shown in Table 8.2a. The percent
correct statistics in this table are undoubtedly inflated because the same data were
used both to build the model and to yield these performance indications (a model
like logistic regression tends to maximize fit to the particular data at hand). In a later
section, independent tests are applied in an attempt to assess the “true”
performance of a Glade Park model. The gain statistic for this model is 0.60.


One problem in applying a model such as the one presented in the above
equation is that measuring many variables and performing many calculations
requires much work, even for a computer. A common data reduction technique is
principal components analysis (Morrison 1976) by means of which the variation in a
large number of variables is typically summarized by a smaller number of
dimensions (principal components), which are linear combinations of the original
variables. This technique is also used to eliminate redundancies resulting from
intercorrelations (collinearity) among variables. The reduced number of components
can be used as predictor variables in classification analyses (Schowengerdt 1983:160).
section on “Approaches Based on Trend in Location Only,” the quadratic
procedure does not produce a model that can be readily interpreted. Finally, in an
application of the technique to archaeological site and nonsite data, I found it to be
overly sensitive to outliers, which offsets most of the advantages gained through the
inclusion of the extra terms (Kvamme 1983c).


Some Simple Classification Models 


The models of the previous section constitute one set of approaches for
partitioning the measurement space (Figure 8.5b) to achieve classification. When
appropriate theoretical assumptions (such as multivariate normality) are met for
each of these models, classification error in the partitioning that is obtained is
minimized. As noted in earlier sections, however, many of these assumptions are
difficult to meet when one is dealing with geographically distributed phenomena.
TABLE 8.2.
Classification performance of initial Glade Park models

A. 14-Variable Model

                                   Predicted Group
                                 Site         Nonsite
Actual Group                   p ≥ 0.5        p < 0.5
Site          Number             107             50
              Percent          (68.2)         (31.8)
Nonsite       Number              43            114
              Percent          (27.4)         (72.6)

Gain = 1 - (27.4/68.2) = 0.60


B. 9-Variable Model

                                   Predicted Group
                                 Site         Nonsite
Actual Group                   p ≥ 0.5        p < 0.5
Site          Number             110             47
              Percent          (70.1)         (29.9)
Nonsite       Number              53            104
              Percent          (33.8)         (66.2)

Gain = 1 - (33.8/70.1) = 0.52
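
The gain figures in Table 8.2 can be reproduced from the cell counts. The
sketch below follows the formula as it appears in the table (one minus the
ratio of the percentage of nonsites predicted as sites to the percentage of
sites predicted as sites); the function and dictionary key names are invented
for the example.

```python
def row_percent(count: int, row_total: int) -> float:
    """Cell count expressed as a percentage of its row total."""
    return 100.0 * count / row_total


def gain(pct_nonsites_called_site: float, pct_sites_called_site: float) -> float:
    """Gain = 1 - (percent of nonsites called 'site' / percent of sites called 'site')."""
    return 1.0 - pct_nonsites_called_site / pct_sites_called_site


# Counts from Table 8.2 (157 sites and 157 nonsites in each panel).
GROUP_SIZE = 157
models = {
    "14-variable": {"sites_called_site": 107, "nonsites_called_site": 43},
    "9-variable": {"sites_called_site": 110, "nonsites_called_site": 53},
}

gains = {}
for name, counts in models.items():
    gains[name] = gain(
        row_percent(counts["nonsites_called_site"], GROUP_SIZE),
        row_percent(counts["sites_called_site"], GROUP_SIZE),
    )
    print(f"{name}: gain = {gains[name]:.2f}")
```

A gain near 1 would mean that the model captures most sites while flagging few
nonsite parcels; a gain near 0 would mean it does no better than flagging
parcels at random.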





Principal components analysis has not been used extensively in site location model 
development, principally because it is very difficult to interpret the meaning of the 
components obtained. Moreover, in order to obtain component scores for each case 
(location) to which the model might be applied, the technique requires measure- 
ments of the original variables anyway, and thus there is little savings in time and 
effort. 


Various stepwise procedures present an alternative (see Chapter 5). These
techniques attempt to select a “best” subset of variables for a model. Best in this
case means that the addition of other variables will not substantially improve the
model because they contain only redundant information (owing to
intercorrelations). In forward stepping, the first step selects the single variable that offers the
maximum discrimination between groups (indicated by some statistic) and enters
this variable into the model. The second step selects the variable from the
remaining pool of variables that offers the greatest increase in discrimination between
classes by considering the relationship of this second variable with the variable
already in the model and with variables not yet in the model. On each succeeding
step additional variables are selected and entered into the model until the
remaining variables (those not yet entered) are determined to contain only redundant
information (again owing to correlation between them and the variables already in

$$
\begin{aligned}
d_{s} = {} & [(31 - 79.855)^2 + (5 - 4.3086)^2 + (24.4 - 13.575)^2 + (30.5 - 27.6383)^2 \\
& + (6104 - 5908.132)^2 + (72 - 472.5353)^2 + (144 - 147.5911)^2 + (144 - 392.7546)^2]^{1/2} \\
= {} & 513.0281
\end{aligned}
$$

$$
\begin{aligned}
d_{ns} = {} & [(31 - 97.1958)^2 + (5 - 4.4515)^2 + (24.4 - 11.263)^2 + (30.5 - 24.705)^2 \\
& + (6104 - 5731.842)^2 + (72 - 916.4705)^2 + (144 - 365.1802)^2 + (144 - 794.7305)^2]^{1/2} \\
= {} & 1152.6467
\end{aligned}
$$

Since $d_{s} < d_{ns}$, 5LA5364 is closer to the site group mean values in the measurement
space and is assigned to the site-present class.
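
The two distances can be verified with a few lines of Python. The measurement
vector and group means are copied from the calculation above; the labels and
the function name `nearest_group` are invented for the example, and the raw
(unstandardized) values are used only because the worked example uses them.

```python
from math import dist

# Measurements for 5LA5364 and the two group mean vectors (X1..X8),
# taken from the distance calculation above.
location = [31, 5, 24.4, 30.5, 6104, 72, 144, 144]
group_means = {
    "site-present": [79.855, 4.3086, 13.575, 27.6383, 5908.132, 472.5353, 147.5911, 392.7546],
    "site-absent": [97.1958, 4.4515, 11.263, 24.705, 5731.842, 916.4705, 365.1802, 794.7305],
}


def nearest_group(x, means):
    """Assign x to the group whose mean vector is closest in Euclidean distance."""
    distances = {label: dist(x, m) for label, m in means.items()}
    return min(distances, key=distances.get), distances


label, distances = nearest_group(location, group_means)
for name, d in distances.items():
    print(f"{name}: d = {d:.4f}")
print("assigned to:", label)
```

The results agree with the published distances to within rounding of the
printed group means.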


In actual practice the data should be standardized so that each variable
(dimension) contributes equally to the calculations. In cases where the variances for
each variable for each class are equal and where the variables are uncorrelated, this
algorithm minimizes classification error (Schowengerdt 1983:54). Even when these
special conditions do not arise, studies have shown that the accuracy of the
the model). The result is a subset of available variables that yields a model whose
performance may be similar to that of a model in which all variables were used but
that requires less information. One drawback of stepwise procedures is that the final
subset of variables obtained in a particular application can vary depending on the
particular stepwise procedure and the selection criteria used and can also vary from
sample to sample. It is usually the case, however, that a certain core of best
discriminating variables is selected.
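
The logic of forward stepping can be sketched independently of any particular
statistic. In the illustration below, `score` is a stand-in for whatever
measure of discrimination a real program would use (for example, a chi-square
or log-likelihood value); the toy scoring function, the variable names, and
the stopping threshold are all hypothetical and serve only to make the example
runnable.

```python
def forward_select(candidates, score, min_improvement):
    """Greedy forward selection.

    `score(subset)` must return a number that is larger for better models.
    Selection stops when the best remaining variable improves the score by
    less than `min_improvement`.
    """
    selected = []
    remaining = list(candidates)
    current = score(tuple(selected))
    while remaining:
        best_var = max(remaining, key=lambda v: score(tuple(selected + [v])))
        best_score = score(tuple(selected + [best_var]))
        if best_score - current < min_improvement:
            break  # the remaining variables add only redundant information
        selected.append(best_var)
        remaining.remove(best_var)
        current = best_score
    return selected


# Toy stand-in for a fit statistic: each hypothetical variable "explains"
# some underlying factors, and a subset scores one point per distinct factor.
FACTORS = {
    "slope": {"terrain", "shelter"},
    "relief_100m": {"terrain"},
    "horizontal_distance_to_water": {"water"},
    "vertical_distance_to_water": {"water"},
    "view_angle": {"view"},
}


def toy_score(subset):
    covered = set()
    for name in subset:
        covered |= FACTORS[name]
    return len(covered)


chosen = forward_select(FACTORS, toy_score, min_improvement=1)
print(chosen)
```

In this toy case the two water variables carry the same information, so only
one of them enters; which one is chosen depends on the order of evaluation,
echoing the point that stepwise subsets can vary between procedures and
samples.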


The 14 variables of the Glade Park data were subjected to stepwise procedures
using the BMDP stepwise logistic regression program (LR; Dixon et al. 1983).
Variables were entered at each step on the basis of largest chi-square value
(suggesting best discrimination). A subset of nine variables was ultimately selected
by graphing at each step changes in several statistics, including (a) the
improvement chi-square, (b) the model log likelihood, and (c) the goodness-of-fit statistic
R² (described above), all of which are monotonically related functions. After the
ninth variable was entered, changes in all of these statistics leveled off, suggesting
that no substantial model improvement would occur if the remaining five variables
were included. The resulting nine-variable model is


$$
\begin{aligned}
L = {} & -0.158 - 0.0401X_1 + 0.518X_2 - 0.00224X_7 - 0.0133X_9 - 0.000602X_{10} + 0.00738X_{11} \\
& - 0.00582X_{12} - 0.0183X_{13} + 0.000804X_{14}
\end{aligned}
$$


and for this model gain equals 0.52. The classification performance of this
nine-variable model, when it was applied to the same data used to create it, is shown in
Table 8.2b. (Independent tests of Glade Park models are given below.) A
comparison of the nine-variable model shown here and the 14-variable model described
above shows that not only are the models very similar (in terms of the coefficients)
but so are their suggested performances, as indicated by the statistics in Table 8.2.


APPLICATION COMPARISON OF QUANTITATIVE 
LOCATIONAL MODELS 


In this section several forms of quantitative data models of characteristics of
locations that have been used in archaeological research are presented and
compared. Each type of model achieves a partitioning of the measurement space (Figure
8.5b) in a different way. The goal of this section is to demonstrate the broad
similarity of these diverse modeling techniques and of their results. For comparison,
each modeling technique is mapped across the same study region using GIS
computer mapping techniques (see Chapter 10); the patterns of the mappings are
often strikingly parallel. The results of this section support the conclusion alluded
to earlier and arrived at by Hixon et al. (1980): the particular classification
algorithm used is less important than the representativeness of the samples used in
predictive model development.


All models presented in this section were developed using the same data from a
Colorado plains study region (Kvamme 1984, 1986). This study region of nearly 575
km² was gridded into approximately 230,000 cells (land parcels), each measuring 50
m on a side; these cells were the elementary units of analysis. The archaeological
data used for the models consisted of 269 locations (cells) containing open-air lithic
scatter sites. A background control group of 1154 locations without archaeological
remains (nonsites) was also used. These large sample sizes should help to illustrate
the relative performance of each modeling procedure. Eight environmental
variables calculated at these locations by a computer through GIS techniques formed the
data base, and in all models all eight variables are used for consistency in
comparison. The variables are aspect (X1), slope (X2), local relief within 100 m (X3), local
relief within 300 m (X4), a canyon rim index value (the “shelter” volume measure
described above; X5), distance to nearest point of vantage (mesa edge, canyon rim,
or hill or ridge top; X6), distance to the closest drainage (X7), and distance to the
closest second-order (or greater) drainage (using Strahler order ranking; X8). The
reader is referred to the section “Variables Used in Locational Research” for a
description of how these variables are measured.


Robust Classification Models 


Robust classification models can be grouped into two types: parametric and 
nonparametric. Parametric techniques assume a particular type of 
statistical distribution (usually multivariate normality) and then estimate 
parameters of that distribution (e.g., means, variances, and covariances). Nonparametric 
classification procedures make no assumptions about distributional form and are 
sometimes considered particularly robust because they work under a wide range of 
distributional types (if the groups to be classified are reasonably distinct). It should 
be noted that under the same conditions (varied distributional types) parametric 
methods usually provide good results, even when the assumed (multivariate 
normal) distribution does not occur (Schowengerdt 1983:176). 


Two parametric techniques, a linear discriminant function (commonly called 
discriminant analysis) and Bayes’s maximum likelihood, are compared below with 
logistic regression, a nonparametric technique. The section ends with a brief 
discussion of a quadratic classification technique. 


Discriminant Analysis 


Because many archaeologists are familiar with discriminant analysis and 
because the software needed to use this technique is readily available in common 
statistical packages, it has been the dominant technique used for site location model 
development (Kvamme 1980; Larralde and Chandler 1981; Peebles 1981; Schroedl 
1984). 


The overall strategy of discriminant analysis in the two-group situation entails 
summarizing class differences by a linear combination of the multiple original 
variables, where each observation is assigned a score on the single resulting 
dimension or discriminant axis. The discriminant function has the characteristic of 
maximizing the separation between groups along the axis, assuming multivariate 
normality and equal group variation. A maximum likelihood technique is then used 
on the discriminant axis to evaluate probabilities of group membership (see Chapter 
5). 


When the example data of 269 site and 1154 nonsite locations are used, the 
following discriminant function is obtained through the BMDP discriminant 
analysis program 7M (Dixon et al. 1983): 


$$D_i = -5.7058 - 0.0047X_{1i} - 0.08X_{2i} + 0.05X_{3i} - 0.0104X_{4i} + 0.00131X_{5i} - 0.0002X_{6i} - 0.001X_{7i} - 0.0008X_{8i}$$


where $D_i$ is the discriminant score for the ith case (location) and X1 through X8 are 
the variables defined above. Like L, D can range between positive and negative 
infinity. A simple transformation yields a value p, which ranges from 0 to 1, allowing 
interpretation (when the assumptions of this model are fully met) as the probability 
of a location’s membership in the site group, conditional solely on the 
measurements. This transformed value is calculated as follows: 


$$p_i = \frac{e^{-0.5(D_i - \bar{D}_s)^2}}{e^{-0.5(D_i - \bar{D}_s)^2} + e^{-0.5(D_i - \bar{D}_{ns})^2}}$$


where $\bar{D}_s$ is the estimated mean (centroid) of discriminant scores for the site group 
and $\bar{D}_{ns}$ is the mean discriminant score for the nonsite group. 


To illustrate use of these formulas, the environmental characteristics of site 
5LA5364, one of the 269 sample sites, are shown in Table 8.3. When these data are 
used, the first equation yields a discriminant score of D = 2.1909. If this value and the 
site and nonsite sample centroids on the discriminant axis ($\bar{D}_s$ = 0.8304, $\bar{D}_{ns}$ = -0.1936) 
are inserted in the second equation, then p = 0.8719. Since p is greater than 0.5, which 
is the traditional decision rule, this location would be correctly classified as a site. Of 
course, this model can be used for prediction when it is applied to locations of 
unknown group membership. 
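

The transformation from discriminant score to probability can be sketched in a few lines of Python. This is only an illustration under the assumption that the transformation compares unit-variance normal densities centered on the two group centroids; the function name `site_probability` is hypothetical, and the numbers are those reported above for 5LA5364.

```python
import math

def site_probability(d, d_site_mean, d_nonsite_mean):
    """Convert a discriminant score into a site-membership probability.

    Assumes unit-variance normal densities centered on each group centroid
    along the discriminant axis (an assumption of this sketch).
    """
    site_density = math.exp(-0.5 * (d - d_site_mean) ** 2)
    nonsite_density = math.exp(-0.5 * (d - d_nonsite_mean) ** 2)
    return site_density / (site_density + nonsite_density)

# Score and centroids reported in the text for site 5LA5364.
p = site_probability(2.1909, 0.8304, -0.1936)
is_site = p > 0.5  # traditional decision rule
print(round(p, 3), is_site)
```

Applying the same function to the score of a location of unknown membership would give its predicted probability under the same assumptions.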





TABLE 8.3. 
Values for eight environmental variables: 5LA5364, all 269 sites, and 1154 nonsite locations in 
the Colorado plains study region 


                                                        All Sites                  Nonsites
Environmental Variable                    5LA5364      x̄           s            x̄           s
X1  aspect                                   31        79.8550     52.7906      97.1958     52.4157
X2  slope grade                               5         4.3086      4.1598       4.4515     [illegible]
X3  relief within 100 m (m)                24.4        13.5750     10.4165      11.263      [illegible]
X4  relief within 300 m (m)                30.5        27.6383     18.3213      24.705      27.2067
X5  canyon rim index                       6104      5908.1320    460.7342    5731.842      [illegible]
X6  vantage distance (m)                     72       472.5353    637.6298     916.4705    841.7876
X7  distance to closest drainage (m)        144       147.5911    165.4219     365.1802     [illegible]
X8  distance to second-order drainage (m)   144       392.7546    465.3908     794.7305     [illegible]


Replicating this procedure for the 1423 site and nonsite sample locations yields 
the initial model accuracy indications (percent correct statistics) shown in Table 
8.4. The gain statistic can be estimated from these data. The percentage of the total 
area covered by the model (at the p = 0.5 cutoff) can be estimated by using the 
percentage of nonsite locations classified by the model to the site group, 32.1 
percent in this case. We can use this figure as an estimate because in the region 
under study (as with most regions) the area occupied by site locations constitutes 
only a tiny percentage of the study area (in the present case about 1 percent of all 
the locations, or 50 m cells, in the region), which means that the nonsite locations 
represent about 99 percent of the total area (see Kvamme 1984). Thus, if the model 
classifies 32 percent of the nonsites to the site group, we can infer that 
approximately 32 percent of the total area of the region would be classified in the site group 
by the model if it were mapped. The percentage of all sites within the model's area 
is estimated by the percentage of sites correctly classified by the model to the site 
group, 76.2 percent for the current model (Table 8.4). Based on these calculations, 
the discriminant model yields a gain of 1 - (32.1/76.2), or 0.579. It should be noted that 
both the percent correct and the gain statistics presented in Table 8.4 are inflated 
owing to a variety of factors, the most notable of which is that the same data were 
used both to build the model and to evaluate its performance (procedures given 
below help to correct inflated statistics through independent tests). 
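

The gain calculation is simple enough to express directly. The following Python sketch restates the arithmetic of the preceding paragraph; the function name `gain` and its argument names are hypothetical, and the percentage of nonsites classified as sites stands in for the percentage of area, as argued above.

```python
def gain(pct_area_classified_site, pct_sites_correct):
    """Gain = 1 - (percent of area in the site zone / percent of sites in that zone)."""
    if pct_sites_correct <= 0:
        raise ValueError("percent of sites correctly classified must be positive")
    return 1.0 - pct_area_classified_site / pct_sites_correct

# Discriminant model: 32.1 percent of nonsites (a proxy for area) and
# 76.2 percent of sites fall in the site-likely zone at the p = 0.5 cutoff.
print(round(gain(32.1, 76.2), 3))
```

A gain near 1 indicates that a small mapped area captures most sites; a gain near 0 indicates that the model does no better than the proportion of area it covers.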


We can illustrate the application of this model when mapped over a region by 
using the Colorado plains 5.5 by 8.5 km test study region discussed in the section 
entitled “Approaches Based on Trend in Location Only.” This region, which can be 
characterized as a level plain dissected by a number of deeply entrenched canyons, 
represents only a small portion of the larger study area from which the sample data 
were derived. Approximately half of the 95 sites in this test study region are 
contained in the larger sample of 269 sites used for development of this model. 
Computer measurement and mapping techniques in the form of a geographic 
information system (see Chapter 10) were used to estimate values for the eight 
variables in each 50 by 50 m cell and to map the results of the model over the 
approximately 19,000 cells of this test region (Figure 8.7a). 


In this figure, and in those that illustrate mapping of the subsequently 
discussed models, estimated probability values are portrayed in five levels (in steps 
of 0.2 and in levels of increasing darkness). Thus, the traditional p = 0.5 decision rule 
lies midway within the second level of shading. The actual site locations in this 
study region were shown in Figure 8.4a and may be compared with this predictive 
map. Since discriminant analysis assumes equal class variation, achieved by pooling 
the sample covariance matrices, a greater proportion of the environment tends to be 
classified with higher p-values (as indicated by the extent of the shaded regions) 
with this technique than with other methods described below that do not make this 
assumption (compare Figures 8.7b and c). 


TABLE 8.4. 
Comparison of classification performance rates of several site location modeling procedures 
(row percents are given in parentheses) 

[Table body not recoverable; surviving entries: gain = 0.482, gain = 0.467, gain = 0.373] 


Maximum Likelihood Classifier 


The maximum likelihood classifier is the most commonly used classification 
procedure in many disciplines, particularly in remote-sensing modeling 
applications (Moik 1980; Schowengerdt 1983), although it has been used less frequently in 
archaeological site location studies (Morain et al. 1981). 


The probability that an observation belongs to the kth class, according to 
multivariate normal theory, is described by the following function: 


$$p_k(X_i) = \frac{1}{(2\pi)^{j/2}\,|\Sigma_k|^{1/2}} \exp\left[-\tfrac{1}{2}(X_i - \mu_k)'\,\Sigma_k^{-1}\,(X_i - \mu_k)\right]$$





where $X_i$ refers to the vector of measurements of the multiple variables at location i, $\mu_k$ 
contains the vector of multiple mean values associated with class k, and $\Sigma_k$ is the 
corresponding dispersion matrix containing j rows and columns of variances and 
covariances for class k (Swain 1978:156; also see Green 1978 for a discussion of matrix 
algebra techniques). In practice, the means and dispersion matrices are unknown; 
they are estimated by sample means, variances, and covariances. An observation is 
assigned to the class for which it has the greatest probability value. 


Although discussion of matrix algebra is beyond the scope of this presentation, 
a simplified description of the technique follows. For a single variable we can 
imagine a normal probability curve with its maximum height or density at the mean 
value and with a width that is indicative of the variation in the distribution. For any 
value of a variable we can determine the density (height) of the distribution. 
Similarly, in a multivariate context multiple measurements can be assessed by the 
above formula relative to the multiple mean values for a class, considering at the 
same time the nature of the dispersion within that class, and this yields a 
multivariate normal density value. A density can be determined for each class under 
consideration. To illustrate with hypothetical values, if the multivariate density for 
Class A is determined to be 0.3 for the multiple environmental measurements made 
at some location and the density for Class B is determined to be 0.2 (in a two-class 
problem), then the measurements have a higher probability of belonging to Class A 
than to Class B; in fact, the probability of membership in Class A can be estimated as 
p = 0.3/(0.2 + 0.3) = 0.6. The mathematics of this procedure perform optimally when 
multivariate normality and independent observations can be assumed (i.e., 
classification error is minimized), even though this technique does not require equal 
covariance matrices (see Chapter 5). The Statistical Analysis System PROC 
DISCRIM performs multivariate classification through the maximum likelihood 
method (SAS Institute 1982). 
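

The single-variable picture described above can be made concrete with a short Python sketch. It covers only the univariate case, not the multivariate formula, and the class means and standard deviations are hypothetical values chosen for illustration rather than figures from the study.

```python
import math

def normal_density(x, mean, sd):
    """Height of a univariate normal curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Hypothetical (mean, sd) for one variable in two classes; not study values.
class_a = (10.0, 2.0)   # tightly clustered class
class_b = (14.0, 6.0)   # more dispersed class

x = 11.0
density_a = normal_density(x, *class_a)
density_b = normal_density(x, *class_b)
print(density_a > density_b)
```

Because each class keeps its own spread, a tightly clustered class yields high densities only near its mean, which is the property that distinguishes this approach from a pooled-variance rule.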


To illustrate application of this technique, data from site 5LA5364 are again 
used (Table 8.3). Entering these data into the above equation, together with 
estimated means, variances, and covariances for the site and nonsite groups (some of 
which are included in Table 8.3), yields a density for the site group of 4.8539 × 10⁻²¹ 
and a density for the nonsite group of 3.4153 × 10⁻²². Thus, the measurements at 
5LA5364 indicate that this location has a higher probability of membership in the 
site-present than the site-absent group, and it would be appropriate based solely on 
these measurements to assign the 5LA5364 location to the site class. The conditional 
probabilities become 


$$p(\text{site} \mid X_i) = \frac{4.8539 \times 10^{-21}}{4.8539 \times 10^{-21} + 3.4153 \times 10^{-22}} = 0.9343$$


$$p(\text{nonsite} \mid X_i) = 1 - p(\text{site} \mid X_i) = 1 - 0.9343 = 0.0657$$
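

The step from class densities to conditional probabilities is a simple normalization, sketched below in Python. The function name `posterior_probabilities` is hypothetical, and the sketch assumes equal prior weight for each class, as the calculation above implies; the densities are those reported for 5LA5364.

```python
def posterior_probabilities(densities):
    """Normalize class densities into conditional probabilities (equal priors assumed)."""
    total = sum(densities.values())
    return {label: value / total for label, value in densities.items()}

# Densities reported in the text for 5LA5364.
probs = posterior_probabilities({"site": 4.8539e-21, "nonsite": 3.4153e-22})
assigned = max(probs, key=probs.get)  # class with the greatest probability
print(assigned, round(probs["site"], 4))
```

The same function handles more than two classes, which is relevant when individual site types are modeled.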
Applying identical calculations to the 1423 sample locations yields the initial 
model accuracy rates and gain statistic shown in Table 8.4. The results of the 
maximum likelihood technique applied to each of the 19,000 locations in the test 
study region are mapped in Figure 8.7b. Although the classification accuracy is 
about the same as that provided by the discriminant analysis, note that the 
maximum likelihood procedure maps a relatively smaller portion of the region as 
site likely because it takes into account the lesser environmental variation usually 
exhibited by a site-present class, while the discriminant analysis model used above 
does not. 


Logistic Regression 


Multiple logistic regression has recently come into use as a classification 
technique (e.g., Maynard and Strahler 1981; Pindyck and Rubinfeld 1976:237-263; 
Schmidt and Strauss 1975), and it has been applied in several studies of 
archaeological site location (Custer et al. 1983, 1986; Holmer 1982; Kvamme 1983b, 1983c, 1986; 
Parker 1985; Scholtz 1981). This nonparametric technique makes no assumptions 
about distributional form (Wrigley 1976) and has been shown to offer improved 
classificatory performance over discriminant analysis when the data are not 
multivariate normal (Maynard 1981; Press and Wilson 1979). Maynard and Strahler (1981) 
argue that logistic regression is the optimal statistical classifier for remotely sensed 
data, and because no distributional assumptions are made, this technique is 
appropriate for nominal-, ordinal-, or interval-scaled data or for various combinations of 
these levels of measurement. Logistic regression has been applied in several 
examples in earlier sections of this chapter. 


Logistic regression can be better understood if we consider the results of 
applying a multiple linear regression model to a dichotomous dependent variable, 
such as site presence and site absence. Such a model has a number of serious 
problems in this context (Wrigley 1976:7-9; Chapter 5, this volume), not the least of 
which is that predictions can range in value between plus and minus infinity, 
making it difficult to interpret these predictions as probabilities. Logistic regression 
is able to overcome these difficulties and yield a result that is constrained between 0 
and 1. This result can be interpreted as a predicted probability of class membership 
(when assumptions of independence and random sampling are met) through use of 
the logistic transformation, 


$$p_i = \frac{1}{1 + e^{-L_i}}$$


where the logistically derived discriminant function at the ith location is 


$$L_i = a + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + \beta_j X_{ji}$$


and a and the $\beta_j$ are the estimated intercept and regression weights. 
A logistic regression analysis was performed using the example data of 1423 
site-present and site-absent locations and the BMDP program LR (Dixon et al. 
1983), and the result was the following function: 


$$L_i = -6.8837 - 0.0043X_{1i} - 0.114X_{2i} + 0.0277X_{3i} - 0.0136X_{4i} + 0.00164X_{5i} - 0.000626X_{6i} - 0.0043X_{7i} - 0.000777X_{8i}$$


When applied to the measurements from 5LA5364 (Table 8.3), these equations yield 
L = 1.9085 and p = 0.8708. Based on its environmental characteristics, 5LA5364 would 
be correctly classified to the site-present group. 
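

The worked example for 5LA5364 can be reproduced with a few lines of Python. This is a sketch rather than the BMDP procedure itself: the names `INTERCEPT`, `WEIGHTS`, `logistic_score`, and `logistic_probability` are hypothetical, the weights are those reported above, and the measurements are the values used for 5LA5364 in the chapter's worked examples.

```python
import math

# Intercept and weights for X1..X8 as reported for the example data; the
# negative intercept is the sign that reproduces the reported L = 1.9085.
INTERCEPT = -6.8837
WEIGHTS = [-0.0043, -0.114, 0.0277, -0.0136, 0.00164, -0.000626, -0.0043, -0.000777]

def logistic_score(x):
    """Linear predictor L for one location's eight measurements."""
    return INTERCEPT + sum(w * v for w, v in zip(WEIGHTS, x))

def logistic_probability(score):
    """Logistic transformation of L into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))

# Measurements for 5LA5364 as used in the chapter's worked examples.
site_5la5364 = [31, 5, 24.4, 30.5, 6104, 72, 144, 144]
L = logistic_score(site_5la5364)
p = logistic_probability(L)
print(round(L, 4), round(p, 4))
```

Mapping the model over a region amounts to evaluating these two functions for the eight measurements of every cell.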


Model accuracy for the logistic regression application, as measured by the gain 
statistic in Table 8.4, is slightly higher than it was for the previous, parametric 
techniques. Figure 8.7c shows the results of mapping the logistic regression model 
over the test study area. Since logistic regression makes no assumptions about 
distributional form, it is usually regarded as a very robust procedure. This would 
appear to be an advantage for archaeological locational modeling because site 
location data are decidedly nonnormal, but in fact, the application of this technique 
to the sample data produced results that are very similar to the results obtained by 
the previous classifiers, both in performance (Table 8.4) and in mapped results 
(Figure 8.7). 


Quadratic Classification Procedure 


The quadratic classification method is a general technique that can be applied 
to such statistical models as discriminant analysis and logistic regression when 
group variances and covariances have been found to be markedly unequal. This 
procedure has been shown to offer improved classificatory performance in these 
situations (Anderson 1975; Eisenbeis and Avery 1972:44; Michaelis 1973; Smith 1947). 
In archaeological predictive modeling, Kohler and Parker (1986) have applied 
quadratic discriminant analysis to simulated data, and I have applied quadratic 
logistic regression as a test case in actual model development (Kvamme 1983c). The 
quadratic procedure incorporates all quadratic terms (i.e., squared terms for each 
variable and all possible interaction pairs) into a model, along with the predictor 
variables being used. This causes the decision boundary to wrap or curve around 
the group with less variation (a hypothetical quadratic decision boundary is shown 
in Figure 8.5b), which can provide an increase in model accuracy. 


Any benefits obtained are not without cost, however. The discriminant 
analysis and logistic regression models presented in the previous sections require 
estimates of j + 1 parameters (where j is the number of predictor variables) to yield a 
linear decision boundary in the measurement space (Figure 8.5b). The nonlinear 
decision boundary of the quadratic model (Figure 8.5b) requires estimates of (j + 1) 
+ j(j + 1)/2 parameters (thus a nine-variable model would require estimates of [9 + 1] 
+ 9[9 + 1]/2 = 55 parameters). This increase in the number of parameters may require 
a corresponding increase in sample size in order for estimation to be reliable. 
Another problem is that, like the polynomial regression technique discussed in the 
section on “Approaches Based on Trend in Location Only,” the quadratic 
procedure does not produce a model that can be readily interpreted. Finally, in an 
application of the technique to archaeological site and nonsite data, I found it to be 
overly sensitive to outliers, which offsets most of the advantages gained through the 
inclusion of the extra terms (Kvamme 1983c). 
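

The growth in the number of parameters can be checked with a small Python sketch. The function names `quadratic_terms` and `parameter_count` are hypothetical; the sketch simply expands a set of predictors into its squares and pairwise products and counts the resulting terms plus an intercept.

```python
from itertools import combinations_with_replacement

def quadratic_terms(values):
    """Return linear terms followed by all squares and pairwise products."""
    second_order = [a * b for a, b in combinations_with_replacement(values, 2)]
    return list(values) + second_order

def parameter_count(j, quadratic=False):
    """Number of estimated parameters: intercept + j, plus j(j + 1)/2 if quadratic."""
    count = j + 1
    if quadratic:
        count += j * (j + 1) // 2
    return count

print(parameter_count(9), parameter_count(9, quadratic=True))
print(len(quadratic_terms([1.0, 2.0, 3.0])) + 1)  # +1 for the intercept
```

For the eight variables used in this chapter, the same count rises from 9 parameters for the linear model to 45 for the quadratic one.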


Some Simple Classification Models 


The models of the previous section constitute one set of approaches for 
partitioning the measurement space (Figure 8.5b) to achieve classification. When 
appropriate theoretical assumptions (such as multivariate normality) are met for 
each of these models, classification error in the partitioning that is obtained is 
minimized. As noted in earlier sections, however, many of these assumptions are 
difficult to meet when one is dealing with geographically distributed phenomena. 


A number of simple mathematical rules have been developed to achieve a 
partitioning of the measurement space in pattern-recognition and image-analysis 
studies (Duda and Hart 1973; Moik 1980; Schowengerdt 1983). These procedures can 
be classed as nonparametric because no assumptions are made about probability 
distributions, and in some cases they perform with accuracy rates comparable to 
those of the models discussed in the previous section. An important advantage of 
these procedures is that they are easier to calculate (many can be done by hand) 
than the computationally burdensome procedures described above. Although a 
wide range of possible examples exist, only two are discussed here: the minimum 
distance classifier and the level slice classifier. 


Distance Measures 


The minimum distance algorithm simply classifies a location to the class that it is 
“closest” to in the measurement space (Figure 8.5b). In other words, a location 
(with characteristics summarized by measured variables) is assigned to one of the 
classes if its distance from the center of that class is less than its distance from the 
center of the other class(es) (Schowengerdt 1983:49-53). The center of each class is 
represented by the point in the measurement space having the class mean value for 
each variable under examination. The distance from the kth class is given by 


$$d2_{ik} = \left[(x_{1i} - \mu_{1k})^2 + (x_{2i} - \mu_{2k})^2 + \ldots + (x_{ji} - \mu_{jk})^2\right]^{1/2}$$


which is simply the Euclidean distance between the values of the j variables ($x_1$, $x_2$, 
..., $x_j$) measured at the ith location, and the mean values for each variable ($\mu_1$, $\mu_2$, ..., 
$\mu_j$) in the kth class. 


To illustrate application of this algorithm, the measurements for 5LA5364 and 
the estimated means for the sample site (s) and nonsite (ns) classes (Table 8.3) can be 
entered into this equation to yield the following: 


$$\begin{aligned}
d2_s &= [(31 - 79.855)^2 + (5 - 4.3086)^2 + (24.4 - 13.575)^2 + (30.5 - 27.6383)^2 \\
&\quad + (6104 - 5908.132)^2 + (72 - 472.5353)^2 + (144 - 147.5911)^2 + (144 - 392.7546)^2]^{1/2} \\
&= 513.0281
\end{aligned}$$


$$\begin{aligned}
d2_{ns} &= [(31 - 97.1958)^2 + (5 - 4.4515)^2 + (24.4 - 11.263)^2 + (30.5 - 24.705)^2 \\
&\quad + (6104 - 5731.842)^2 + (72 - 916.4705)^2 + (144 - 365.1802)^2 + (144 - 794.7305)^2]^{1/2} \\
&= 1152.6467
\end{aligned}$$


Since $d2_s < d2_{ns}$, 5LA5364 is closer to the site group mean values in the measurement 
space and is assigned to the site-present class. 
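

The calculation above can be sketched in Python as follows. The names `euclidean_distance` and `minimum_distance_class` are hypothetical; the class means and the 5LA5364 measurements are those entered in the worked example, and, like that example, the sketch uses unstandardized values.

```python
import math

SITE_MEANS = [79.855, 4.3086, 13.575, 27.6383, 5908.132, 472.5353, 147.5911, 392.7546]
NONSITE_MEANS = [97.1958, 4.4515, 11.263, 24.705, 5731.842, 916.4705, 365.1802, 794.7305]

def euclidean_distance(x, class_means):
    """Straight-line distance between a location and a class center."""
    return math.sqrt(sum((v - m) ** 2 for v, m in zip(x, class_means)))

def minimum_distance_class(x, class_centers):
    """Return the label of the class whose center is nearest to x."""
    return min(class_centers, key=lambda label: euclidean_distance(x, class_centers[label]))

site_5la5364 = [31, 5, 24.4, 30.5, 6104, 72, 144, 144]
centers = {"site": SITE_MEANS, "nonsite": NONSITE_MEANS}
print(minimum_distance_class(site_5la5364, centers))
```

With unstandardized data the variables measured on the largest scales (here the rim index and the distance variables) dominate the result, which is why standardization is advisable in practice.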


In actual practice the data should be standardized so that each variable 
(dimension) contributes equally to the calculations. In cases where the variances for 
each variable for each class are equal and where the variables are uncorrelated, this 
algorithm minimizes classification error (Schowengerdt 1983:54). Even when these 
special conditions do not arise, studies have shown that the accuracy of the 
minimum distance classifier is comparable to that of the maximum likelihood 
method (Hixon et al. 1980). The minimum distance classifier may be calculated 
using the Statistical Analysis System PROC NEIGHBOR program (SAS Institute 
1982). 


Figure 8.8a maps the results of applying the minimum distance (d2) classifier to 
the 19,000 locations in the test study area described above. The shaded areas are 
those portions of the region that were classified as being closer to the site class mean 
values in the measurement space than to the nonsite values. This “site-similar” 
region compares favorably with the site subspace delineated by the statistically 
derived maximum likelihood classifier (for comparative purposes, the site subspace 
defined by maximum likelihood as all locations with conditional site probabilities 
> 0.5 is portrayed in Figure 8.8d). Similarly, the classification accuracy and gain 
statistic of the results of applying the d2 classifier to the sample data of 1423 locations 
(Table 8.4) compare favorably with those of the multivariate data analysis 
procedures of the previous section, if the relative difficulties of the two types of 
procedures are taken into account. 


A slightly different distance measure, termed city block distance (Schowengerdt 
1983:51), is merely the sum of absolute distances from the ith location to the class 
mean values: 


$$d1_{ik} = |x_{1i} - \mu_{1k}| + |x_{2i} - \mu_{2k}| + \ldots + |x_{ji} - \mu_{jk}|$$


This distance measure is somewhat easier to employ than the Euclidean distance 
measure because it requires fewer calculations. When mapped across the study 
region (Figure 8.8b) this decision rule yields results almost identical to those of the 
d2 classifier (Figure 8.8a). The classification accuracy and gain statistic for the 
application of the d1 rule to the 1423 sample locations are given in Table 8.4, and 
these, too, are almost identical to those for the d2 results. Despite these similarities, 
the d2 rule is more commonly used because it is more interpretable and tends to 
perform somewhat better than the d1 rule. 
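

A city block version of the rule needs only a sum of absolute differences, as the following Python sketch suggests. The function name `city_block_distance` is hypothetical, and the class means and 5LA5364 measurements are reused from the Euclidean example; the chapter does not report d1 values for this site, so the sketch shows only the resulting assignment.

```python
def city_block_distance(x, class_means):
    """Sum of absolute differences between a location and a class center."""
    return sum(abs(v - m) for v, m in zip(x, class_means))

# Class means and 5LA5364 measurements as used in the chapter's Euclidean example.
site_means = [79.855, 4.3086, 13.575, 27.6383, 5908.132, 472.5353, 147.5911, 392.7546]
nonsite_means = [97.1958, 4.4515, 11.263, 24.705, 5731.842, 916.4705, 365.1802, 794.7305]
site_5la5364 = [31, 5, 24.4, 30.5, 6104, 72, 144, 144]

d1_site = city_block_distance(site_5la5364, site_means)
d1_nonsite = city_block_distance(site_5la5364, nonsite_means)
label = "site" if d1_site < d1_nonsite else "nonsite"
print(label)
```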
Figure 8.8. Simple mathematical models based on characteristics of locations (eight environmental variables): (A) minimum (Euclidean) distance (d2), 
(B) minimum (city block) distance (d1), (C) level-slice classifier (±1.75 s.d.), (D) maximum likelihood classifier at p ≥ 0.5. 
Level Slice Classifier 


This algorithm is sometimes called the parallelepiped classifier (Moik 
1980:271-272). It establishes decision boundaries that are parallel to each axis in the 
measurement space (hence the term level slice) by forming a hyperrectangle or 
parallelepiped about the class(es) of interest; a hypothetical level-slice decision 
boundary in a measurement space is portrayed in Figure 8.5b. In image processing, 
minimum rectangles are usually fitted around class boundaries derived through 
maximum likelihood estimation. The best known archaeological application of this 
technique is the polythetic choice model of prehistoric settlement developed for the 
Reese River Valley of Nevada (Williams et al. 1973). This polythetic choice model 
used arbitrary cutpoints (level slices) on each of seven environmental variable axes. 
In this case, however, a location was classified to the site-present group (no actual 
nonsites were used) if any five of the seven values measured at a location were below 
the threshold levels. The level slice classifier can be defined for any variable as 


$$(\bar{x} - t) \le x_i \le (\bar{x} + t)$$


where $x_i$ is the value of a variable at the ith location, $\bar{x}$ is the estimated mean value for 
the class of interest, and t is a threshold or cutpoint value (see Moik 1980:273). 


To illustrate application of this method with our example data, measurements 
from 5LA5364 and the mean and standard deviation data from the site group sample 
were used to produce the results shown in Table 8.4. The threshold value, t, is 
arbitrarily set at ±1.75 standard deviations of the site group mean. Note that any 
threshold value may be selected and that this choice will directly affect subsequent 
performance; in the present case, several values of t were examined before the ±1.75 
s.d. value was selected because of its relatively good performance. Inserting the 
relevant data for each variable, we find that the following relations hold: 


x̄ - t(s.d.)                      5LA5364        x̄ + t(s.d.) 
  79.8550 - (1.75)52.7906    ≤     31      ≤     79.8550 + (1.75)52.7906 
   4.3086 - (1.75)4.1598     ≤      5      ≤      4.3086 + (1.75)4.1598 
  13.5750 - (1.75)10.4165    ≤   24.4      ≤     13.5750 + (1.75)10.4165 
  27.6383 - (1.75)18.3213    ≤   30.5      ≤     27.6383 + (1.75)18.3213 
5908.1320 - (1.75)460.7342   ≤   6104      ≤   5908.1320 + (1.75)460.7342 
 472.5353 - (1.75)637.6298   ≤     72      ≤    472.5353 + (1.75)637.6298 
 147.5911 - (1.75)165.4219   ≤    144      ≤    147.5911 + (1.75)165.4219 
 392.7546 - (1.75)465.3908   ≤    144      ≤    392.7546 + (1.75)465.3908 


Thus, 5LA5364 is classified to the site group. 
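

The eight interval checks can be expressed compactly in Python. In this sketch the names `level_slice_bounds` and `in_level_slice` are hypothetical, the site group means and standard deviations are those used in the relations above, and the rule requires every measurement to fall within its interval (unlike the five-of-seven rule of the Reese River model).

```python
SITE_MEANS = [79.8550, 4.3086, 13.5750, 27.6383, 5908.1320, 472.5353, 147.5911, 392.7546]
SITE_SDS = [52.7906, 4.1598, 10.4165, 18.3213, 460.7342, 637.6298, 165.4219, 465.3908]

def level_slice_bounds(means, sds, t=1.75):
    """Lower and upper cutpoints on each axis: mean -/+ t standard deviations."""
    return [(m - t * s, m + t * s) for m, s in zip(means, sds)]

def in_level_slice(x, bounds):
    """True when every measurement falls inside its (lower, upper) interval."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(x, bounds))

site_5la5364 = [31, 5, 24.4, 30.5, 6104, 72, 144, 144]
bounds = level_slice_bounds(SITE_MEANS, SITE_SDS)
print(in_level_slice(site_5la5364, bounds))
```

Changing `t` widens or narrows the hyperrectangle, which is how the threshold choice directly affects the proportion of locations classified to the site group.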


Application of this procedure to all 1423 locations in the sample yields the 
accuracy rates and gain statistic given in Table 8.4 for the ±1.75 s.d. threshold 
value. When the level slice is applied to each of the 19,000 locations of the test study 
region, the resulting mapped subset (Figure 8.8c) is very much like those resulting 
from the application of minimum distance (Figure 8.8a and b) and maximum 
likelihood (Figure 8.8d) classifiers. As with all of the above techniques, it is quite 
easy to alter results in either direction simply by changing a cutoff point or 
threshold value. 


COMBINING MODELS FOR LOCATIONAL CHARACTERISTICS 
AND MODELS FOR LOCATION ONLY 


A fundamental dichotomy in types of archaeological locational modeling 
approaches was established early in this chapter. Models were classified as those 
based only on locational data (spatial x,y coordinates) or those based on 
characteristics of the locations, such as environmental information. A fourth-order polynomial 
logistic regression model was presented as a model based only on locational data in 
the section “Approaches Based on Trend in Location Only.” The preceding 
sections have illustrated several approaches for models based on the environmental 
characteristics of locations. Since both approaches to modeling provide 
information (and generally independent information) summarizing where 
archaeological sites are located, it would seem a logical step to combine these 
approaches in order to enhance our ability to model prehistoric site distributions. 


To conduct this experiment the Colorado plains study region is used again. 
Unlike the analyses in the previous section, which utilized 1423 site and nonsite 
locations from the entire 575 km² study region, the present analyses make use only 
of the samples of 95 site-present and 54 site-absent locations from the 5.5 by 8.5 km 
study portion of the larger region. (The smaller region has been used in Figures 8.4, 
8.7, and 8.8 to portray various model mappings.) This smaller region is used here for 
two reasons. First, the logistic trend-surface technique for location only is best 
suited for modeling the reduced complexity of a smaller region (and such a model 
has already been established for the present region in Figure 8.4). Second, the 
environmentally based models of the previous section were based on locational 
patterns from a collection of sites from a huge region; therefore, these models 
averaged the locational pattern of all the sites and nonsites from the wider region. It 
is germane to illustrate the power of the environmentally based approach by fitting 
such a model to a relatively small region, which must contain a lower degree of 
environmental variability than the larger study area and therefore offer the 
potential of a tighter fit of the model to the data. 


In order to facilitate comparison of these modeling approaches, the locations of 
the 95 site-present cells (out of nearly 19,000 cells) in the study region are shown in 
Figure 8.9a. The logistic trend-surface model derived through use of the 
fourth-order powers of the spatial (x,y) coordinates of the 149 site and nonsite locations is 
shown again in Figure 8.9b. The classification accuracy of this location-only model for 
these data is given in Table 8.5. The pseudo-R² goodness-of-fit statistic (defined 
earlier) for this model is 0.5318, and the gain statistic is estimated as 1 - (31.5/82.1) 
= 0.616. 
TABLE 8.5. 

Comparison of classification performances of site location models for (a) locational coordinates, 
(b) locational characteristics, and (c) locational coordinates and characteristics for a portion of 
the Colorado plains study area (row percents given in parentheses) 


              (a) Locational Coordinates     (b) Locational Characteristics   (c) Coordinates and Characteristics
                      Prediction                      Prediction                       Prediction
Actual         Site          Nonsite          Site          Nonsite            Site          Nonsite
               (p ≥ 0.5)     (p < 0.5)        (p ≥ 0.5)     (p < 0.5)          (p ≥ 0.5)     (p < 0.5)
Site           78 (82.1)     17 (17.9)        87 (91.6)      8 (8.4)           90 (94.7)      5 (5.3)
Nonsite        17 (31.5)     37 (68.5)         9 (16.7)     45 (83.3)           7 (13.0)     47 (87.0)
pseudo-R²      0.5318                         0.7156                           0.8081
gain           0.616                          0.818                            0.863





A logistic regression model for environmental locational characteristics was also 
fitted to the site-present and site-absent data in the 5.5 by 8.5 km study region 
using the eight variables described in the preceding sections (Figure 8.9c). Since 
this model is based only on the data patterns of the 95 sites and 54 nonsites in the 
smaller study area and not on all of the 1423 site and nonsite locations of the larger 
region, the resulting model provides a much tighter fit to the site data than the 
previous logistic regression model (Figure 8.7c). The classification accuracy 
statistics for this locational characteristics model are given in Table 8.5; pseudo-R² = 0.7156 and 
gain equals 0.818. The fact that environmental characteristics of locations provide 
more information than simple locational coordinates is amply illustrated by 
comparison of the resultant models (compare Figures 8.9b and 8.9c). 


Finally, a model was developed that combined both positional data and
information about locational characteristics. This was accomplished by utilizing the 14
polynomial terms of the location-only model and the eight environmental terms of
the locational characteristics model simultaneously in a single logistic regression
model (Figure 8.9d). The results of this model exhibit characteristics of both the
trend-surface and the environmental models, as indicated by the mappings (Figure
8.9b-d). By incorporating both sources of information, classification accuracy is
increased (Table 8.5), as suggested by the higher goodness-of-fit (Rp² = 0.8081) and
gain (0.863) statistics and by the model mapping.
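
The gain figures quoted for the three models can be checked from the row percentages in Table 8.5. The following is a minimal Python sketch, assuming (from the arithmetic 1 - 31.5/82.1 shown in the text) that gain is one minus the ratio of the percentage of nonsites classified as sites to the percentage of sites classified correctly. The function name `gain_statistic` and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def gain_statistic(pct_nonsites_as_sites, pct_sites_correct):
    """Gain = 1 - (% of nonsites classified as sites) / (% of sites correct)."""
    if pct_sites_correct <= 0:
        raise ValueError("percentage of sites correct must be positive")
    return 1.0 - pct_nonsites_as_sites / pct_sites_correct


# Row percentages as reported in Table 8.5:
# (percent of nonsites predicted as sites, percent of sites predicted as sites)
table_8_5 = {
    "locational coordinates": (31.5, 82.1),
    "locational characteristics": (16.7, 91.6),
    "coordinates and characteristics": (13.0, 94.7),
}

gains = {
    name: round(gain_statistic(nonsites, sites), 3)
    for name, (nonsites, sites) in table_8_5.items()
}

for name, gain in gains.items():
    print(f"{name}: gain = {gain:.3f}")
```

Because the inputs are the rounded percentages printed in the table, the results match the published gain values to three decimals; working from the raw counts could differ in the last digit.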


MODELING INDIVIDUAL SITE TYPES 


The methods discussed in the previous sections may be applied not only to
questions of site presence and absence but also to modeling multiple site types
within a region, as has been noted. These might be functional site types or site
types representing different chronological periods, for example. The major problem
in developing locational models for individual site types lies not in the
methodological difficulties of developing the models but in the definition and
operationalization of meaningful site-type categories and in acquiring sufficiently
large samples of the types for analysis. These problems were discussed in greater
detail in the introduction to this chapter.


When dealing with probabilistic locational models of archaeological phenomena,
it is often desirable that the individual class p-values (probabilities of class
assignment) for all of the classes under consideration sum to 1.0 for any location to
which the model(s) are applied. A number of standard and not-so-standard
procedures exist that allow one to constrain estimated probabilities from multiple
groups (e.g., site-type groups) to sum to 1.0 at a location.


The simplest model is one that assumes for any location (a small land parcel,
such as an acre) a limited and finite number of possible outcomes and then estimates
the probability of any given alternative (e.g., by means of some of the modeling
procedures discussed above). In an archaeological context the alternatives that may
occur at a location include an alternative for each possible site type (including
isolates or other remains) and the alternative of no site (no archaeological remains of
any kind), and


p(ns) + p(st_1) + p(st_2) + ... + p(st_k) = 1

where ns is the site-absent alternative and the st_i refer to the individual site types.
This model assumes that all possible site types have been specified, but this
difficulty may be circumvented simply by defining a type called “other.” This kind
of model is assumed by many packaged computer programs for statistical analysis,
including those for multigroup discriminant analysis (e.g., PROC CANDISC; SAS
Institute 1982).


An alternative model that perhaps offers a number of advantages, given our
limited knowledge of the past and the normal difficulties of dealing with the
archaeological record in terms of defining site types, is a hierarchical model that first
assumes only two possible outcomes at any given location (again, a small land parcel,
such as an acre). One outcome is that some evidence of human activity (ha) occurs at
the location (i.e., some kind of site or cultural manifestation will be found there);
the other is that no evidence of activity (na) occurs at the location, and


p(na) + p(ha) = 1

Outcomes indicating specific kinds of activities, archaeologically represented by
functional types of sites or remains, are then conditional on the outcome that
evidence of human activity is indicated (Wrigley 1982), and


p(st_1) + p(st_2) + ... + p(st_k) = p(ha)
The term p(ha) refers to a human activity space within which all activity in a region
is conducted (see also the introduction to this chapter). This space is represented
archaeologically by all material culture remains, including settlements, sites of
specific function, and isolated occurrences (see Kvamme 1985a for a more detailed
discussion of this concept). Although some researchers might argue that in certain
regions past activity occurred everywhere, the concept of importance here is that of
activity densities: in any region certain locations may have been more favorable for
activity of any kind than others (e.g., locations with level ground surfaces). In fact, in
mountainous regions or in regions containing significant acreage of swamplands, for
example, the human activity space can be restricted to a major extent. Site location
models that lump sites of all types into a single analytical group, whether because
of an inability to form meaningful types from the available evidence or as part of a
preplanned research tactic, are simply developing models of the human activity
space. Such models should, in principle, demonstrate less-pronounced patterning
than models for specific types of sites since the former incorporate many types of
sites with varied locational requirements. Nevertheless, strong and predictable
patterning can sometimes be achieved (e.g., Figure 8.9d portrays a remarkably
strong model for all open-air lithic scatters in the region even though these scatters
undoubtedly represent a variety of functional site types). This hierarchical scheme
has the advantage that locational models for specific site types may be incorporated
as well (e.g., at a later time). The site-type models, however, are conditional not
only on the environmental and other measurements upon which they are based but
also on p(ha).
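
The bookkeeping implied by the hierarchical scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes that a first-stage model supplies p(ha) for a parcel and that a second-stage model supplies site-type probabilities conditional on activity being present, so that each unconditional site-type probability is the product of the two. The function name, the site-type labels, and all numerical values are hypothetical.

```python
def unconditional_type_probabilities(p_ha, conditional):
    """Scale p(st_i | ha) by p(ha) so that the p(st_i) sum to p(ha)."""
    if not 0.0 <= p_ha <= 1.0:
        raise ValueError("p(ha) must lie between 0 and 1")
    total = sum(conditional.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("conditional site-type probabilities must sum to 1")
    return {site_type: p_ha * p for site_type, p in conditional.items()}


# Hypothetical values for one land parcel (not taken from any data set).
p_ha = 0.20  # probability of any evidence of human activity
conditional = {"settlement": 0.25, "small site": 0.60, "other": 0.15}

p_types = unconditional_type_probabilities(p_ha, conditional)
p_na = 1.0 - p_ha  # probability of no evidence of activity

print(p_types)
print("sum of site-type probabilities:", sum(p_types.values()))
print("total including no activity:", sum(p_types.values()) + p_na)
```

The site-type probabilities sum to p(ha), and adding the no-activity alternative brings the total for the parcel to 1, which is the constraint described above.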


The following illustration of locational models for multiple site types uses data 
from a study of German Mesolithic sites by Kvamme and Jochim (1988). This study 
was used as an example of model building with existing data in Chapter 7. As noted 
there, the data available for the 170 known sites in the region were extremely 
limited, making it difficult to distinguish site types with specific functions. The 
amateur collectors who discovered them, however, reported a number of sites that 
contained relatively abundant remains. Of the 170 sites, 39 could be categorized in a 
“settlement” group and 74 were classed in a “small-sites” group (the remaining 
sites were unclassified or represented “isolates” and are not used here). Although 
the validity of these site types is questionable, we can assume for the purposes of 
this discussion that the types are valid and use these data to illustrate the simultane- 
ous modeling of multiple site-type groups within a single region. 


An advantage of using these Mesolithic data as an example of site-type
modeling is that a computerized GIS has been established for the entire 940 km²
study region (see Chapter 10 for a discussion of geographic information systems).
The GIS contains a gridded representation of the study region comprising
approximately 84,000 cells measuring 100 m on a side. Within each grid cell values were
estimated for elevation, slope, aspect, local relief, a measure of view, a measure of
shelter, horizontal distance to nearest water, vertical distance to nearest water, and
horizontal distance to nearest third-order stream. Additionally, grid cells that
contain Mesolithic sites were denoted and information on the site types present in
the cell was encoded. The mathematical details of the site-type models are
presented in Kvamme and Jochim (1988) and Kvamme (1986); for the purposes of this
discussion, “pictures” of each model will be used to indicate the results of site-type
modeling. These mappings were accomplished by applying the models across the
entire study area (i.e., in each of the 84,000 cells) and using computer cartographic
techniques to display the results.

The site location models were developed by contrasting the 39 settlement
locations (cells) and the 74 small-site locations against a representative sample of
3201 locations taken from the background environment at large, rather than from a
group of known site-absent locations (because the latter information was
unavailable). The site types could be contrasted with the background environment because
Mesolithic sites of any type could be argued to be an extremely rare phenomenon,
causing the background environment to constitute a reasonably distinct class for
analysis purposes (in other words, in the sample of background locations only a very
small percentage of the selected locations could be expected to contain as yet
undiscovered Mesolithic sites by chance). The classifier used was logistic regression
(discussed above), and all nine variables listed above were incorporated in the
models.
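
The idea of contrasting site locations with a background sample by logistic regression can be illustrated with a deliberately small Python sketch. It is not the model of Kvamme and Jochim (1988): it uses a single hypothetical predictor (a standardized slope value) rather than the nine variables, invented data values, and a plain gradient-ascent fit chosen only so that the example needs no external libraries. The names `site_probability` and `fit_logistic` are illustrative.

```python
import math


def site_probability(b0, b1, x):
    """Logistic model: estimated probability of the site class at value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def fit_logistic(xs, ys, learning_rate=0.1, iterations=5000):
    """Fit b0 and b1 by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            residual = y - site_probability(b0, b1, x)
            grad0 += residual
            grad1 += residual * x
        b0 += learning_rate * grad0 / n
        b1 += learning_rate * grad1 / n
    return b0, b1


# Hypothetical standardized slope values (illustration only, not the study data).
site_slopes = [-1.5, -1.0, -0.5, 0.0, 0.5]
background_slopes = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]

xs = site_slopes + background_slopes
ys = [1] * len(site_slopes) + [0] * len(background_slopes)

b0, b1 = fit_logistic(xs, ys)
print("intercept:", round(b0, 3), "slope coefficient:", round(b1, 3))
print("p(site) on gentle slope:", round(site_probability(b0, b1, -1.0), 3))
print("p(site) on steep slope:", round(site_probability(b0, b1, 1.0), 3))
```

With these invented values the fitted slope coefficient is negative, so the estimated probability of the site class is higher on gentler ground. In a real application the background sample would be far larger than the site sample, as in the study described above.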


The study region consists of high ridges and plateaus overlooking a number of
river valleys and plainslike areas (Figure 7.4, this volume). The analysis suggested
that sites of the settlement class tend to be located at lower elevations, on less
sloping ground, in regions of less local relief that were more sheltered (i.e., less
likely to be on hilltops and more likely to be on valley bottoms), and in places with
lower values for the overall view measure than the small-site class. Additionally,
sites of the settlement class tend to be closer to relatively secure water sources,
including major drainages (third-order streams), although they did exhibit a slight
orientation toward location at greater distances from nearest drainages when
compared with the small-site class.


These findings are largely borne out by the site type locational models when 
they are mapped over the entire region through a GIS (Figure 8.10). The map of the 
model for the settlement class (Figure 8.10a) shows a locational pattern (the darker
regions) emphasizing areas along a major drainage near the southwest border of the
region and in a plainslike low-elevation area in the far western portion of the region.
The locational pattern mapped for the small-site class (Figure 8.10b) does not show 
an emphasis on these areas. More important, however, the locational models 
indicate that the settlement class is much more highly patterned in terms of location 
than the small-site class, as indicated by the relative sizes of the dark areas in the 
two maps. In other words, the settlement class tends to exhibit a tighter and more 
restrictive locational pattern while the small-site class pattern seems to be more 
variable. 


Elsewhere I have shown with regional survey data from the western United
States that large sites or settlements do indeed tend to be more patterned in
location than smaller sites, which as a group are functionally more variable
(Kvamme 1985a). In that study, it appeared that the greater functional variability
within a small-site class led to greater locational variability (presumably owing to
different locational requirements of individual functional site types that were
pooled within the small-site class) and that these sources of variability caused the
less-pronounced locational patterning of these sites. The large-site or settlement
class, on the other hand, possibly consisted of sites representing a more similar
range of activities (e.g., extended occupation) with similar and thus tighter
locational requirements. Although the integrity of the site classes and sample in the
Mesolithic study may be questioned, the patterning discerned does resemble the
large-site/small-site patterning discussed here.


INTERPRETATION AND EXPLANATION OF DATA PATTERNS


The foregoing sections have presented a number of quantitative data analysis
techniques that are relevant in the development of archaeological locational models.
As scientists, archaeologists are also interested in (a) methodological rigor and (b)
explanation. The methods presented offer great potential for both. The most
obvious benefit obtained from use of quantitative methods of data analysis is that
research findings can be obtained with greater objectivity. Additionally, results
tend to be more easily replicated by other investigators: another researcher can
duplicate an experiment or analysis using identical procedures and the same or even
similar data, allowing independent verification of findings.


Quantitative analysis procedures yield other benefits that may be less obvious.
In traditional archaeological research strategies the researcher only has access to (a)
the raw phenomena (objects, entities, individuals) under investigation; (b) data
(observations or measurements) pertaining to those phenomena; and (c)
relationships subjectively perceived between and among the data or phenomena. The
researcher who has knowledge of and access to empirical data analysis methods, on
the other hand, can greatly augment these most fundamental capabilities because
these procedures yield additional information in the form of (d) descriptive and
summary statistics, which describe and generalize tendencies and patterns in the
data and make relationships explicit; (e) complex data models, which portray the
raw data in different ways, often illustrating or summarizing the essence of multiple
empirical patterns; and (f) unforeseen (multivariate) relationships between classes
of phenomena. Thus, the practicing scientist who makes use of empirical analysis
procedures can greatly increase his or her abilities to postulate a pattern among the
basic facts of the discipline, an important basis for theory formulation.


Quantitative methods of analysis are also beneficial in other domains of
research. In classic deductive research approaches, certain predictions often are
made based on the premises of the initial hypotheses. In the hard sciences these
predictions usually rest on mathematical deductions or established physical laws
such that the predictions must mathematically (or by law) follow from the
hypotheses. In archaeology, which lacks a base of laws or theory, our bridging
arguments that lead to predictions, as Thomas (1979) notes, are “seat-of-the-pants”
kinds of statements. Well-established relationships of a statistical kind
might be used here as an alternative or supplement to such arguments when
predictions are formulated from theory. Finally, the methods of statistical
hypothesis testing are particularly well suited as a means of verifying (or refuting)
hypotheses. A myriad of testing and validation procedures exists for virtually any type
of problem context. Hence, the quantitative investigator is armed with more tools
and capabilities for conducting basic research and for potentially interpreting and
explaining archaeological phenomena.

Previous sections generally presented only the basic statistical facts because
their goal was to describe the procedures used in modeling. In this section, the
interpretation and explanation of these facts are briefly considered. The Glade Park
descriptive statistics in Table 8.1 suggested a number of empirical tendencies. For
example, the sites exhibited tendencies to be located in proximity to permanent
and nearest water sources (when contrasted to the background nonsites) and also
tended to be located with good views of surrounding terrain and close to points of
vantage. The Glade Park sites were distributed with a north-facing preference, on
level ground, and on high points or mesa edges in regions of limited local relief.
Traditionally, explanations of empirical patterns such as these have generally
assumed human selectivity (e.g., Findlow 1980; Green 1973; Kvamme 1985a; Lafferty
1981; Parker 1985; Roper 1979b). For example, using the above evidence we might
argue that sites tend to be close to water because the Glade Park region was arid,
forcing people to carry out most of their activities in areas near water courses. The
aridity argument might also explain the strong tendency for north-facing aspects
since these locations would tend to increase shelter from sunshine during the hot
summer if, indeed, the sites were occupied during the summer. Locational
tendencies toward good views, vantages, and high points might be interpreted as resulting
from the need to watch for game, since the inhabitants of Glade Park were primarily
hunter-gatherers. Such arguments, although plausible, need to be substantiated
with additional evidence before they are taken seriously. Alternative explanations
might also be possible.


The actual equations of the empirical models based on characteristics of
locations also have interpretive potential. The discriminant analysis discussed in
the “Application Comparison” section yielded the following model:


D = -5.7058 - 0.0047 (aspect) - 0.08 (slope) + 0.0152 (relief, 100 m) - 0.005 (relief, 300 m)
    + 0.001 (shelter index) - 0.0002 (vantage distance) - 0.001 (distance to nearest
    drainage) - 0.0008 (distance to second-order drainage)


Positive coefficients associated with a variable suggest that high values of the
variable are related to site presence, while negative coefficients suggest that low
values of the variable are related to site presence. Hence, this model indicates that
high values of relief (within 100 m) and the shelter index and low values of aspect
(i.e., north-facing), slope, relief (within 300 m), vantage distance, and distance to
nearest and second-order drainages are suggested by the data to be related to the
site-present locations.


It is possible to go beyond this level of interpretation when the independent
variables are measured in the same units and are uncorrelated. One way to acquire
variables measured in the same units is to standardize the data. Parker (1985) utilizes
this tactic to interpret logistic regression site location models in Arkansas. In the
above nonstandardized discriminant analysis model, several variables are measured
in the same units. We might compare the absolute values of the associated
coefficients of these variables to assess the relative importance of the variables. For
example, the distance variables are all measured in meters; if we compare them we
find that the data suggest that distance to nearest vantage (with an absolute
coefficient of 0.0002) has about one-fourth as much influence as distance to
second-order drainage (0.0008) and one-fifth as much influence as distance to nearest
drainage (0.001). Distance to nearest drainage is slightly more important to
site-location placement than distance to second-order drainage. In the present case,
however, these variables are positively correlated, and such interpretation should
be made with some caution.
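
The coefficient comparison can be reproduced with a short Python sketch. The coefficients are those printed for the three distance variables in the discriminant function; the helper name `relative_influence` is an illustrative assumption, and the ratios carry the same caveat about correlated predictors noted in the text.

```python
# Coefficients of the distance variables (all measured in meters) as printed
# in the discriminant function above.
distance_coefficients = {
    "vantage distance": -0.0002,
    "distance to nearest drainage": -0.001,
    "distance to second-order drainage": -0.0008,
}


def relative_influence(coefficients, variable, reference):
    """Ratio of absolute coefficients: |b_variable| / |b_reference|."""
    return abs(coefficients[variable]) / abs(coefficients[reference])


vantage_vs_second = relative_influence(
    distance_coefficients, "vantage distance", "distance to second-order drainage")
vantage_vs_nearest = relative_influence(
    distance_coefficients, "vantage distance", "distance to nearest drainage")
nearest_vs_second = relative_influence(
    distance_coefficients, "distance to nearest drainage",
    "distance to second-order drainage")

print(round(vantage_vs_second, 2))   # about one-fourth
print(round(vantage_vs_nearest, 2))  # about one-fifth
print(round(nearest_vs_second, 2))   # somewhat greater than one
```

Such ratios are only meaningful for variables that share units; comparing, say, the slope coefficient with a distance coefficient in this way would not be informative unless the data were first standardized.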


One way to remedy the correlation problem is to perform a principal
components analysis (described earlier) on the variables. This procedure yields linear
combinations of the correlated variables, and a model can then be built that uses the
components rather than the raw variables as predictors (Schowengerdt
1983:159-167). Since the resultant components will be uncorrelated, model
interpretation can be facilitated in this manner. Although this approach has certain merits,
it often is the case that interpretation of the components themselves is quite
difficult.


Explanation of any facts pertaining to archaeological distributions, whether
raw facts or higher-order statistical generalizations, may take a number of forms. A
good approach might be to treat each possible explanation as an alternative
hypothesis. Possible alternative hypotheses for an observed relationship between
archaeological sites in a region and some environmental feature might include (a)
human selectivity, (b) geologic processes, (c) vegetation patterns, and (d) sampling
biases.


To illustrate this multiple-hypothesis approach, recall that the models
discussed in the section on “Application Comparison of Quantitative Locational
Models” all indicate that the locations of open-air lithic scatters tend to occur in
close proximity to second-order (or greater) drainages (Figure 8.7). In Chapter 10, a
histogram of this variable measured at all 230,000 land parcels (50 by 50 m units) in
the study region is compared with a histogram of the same variable measured only
at the nearly 600 parcels with sites in the area (Figure 10.11). These histograms
clearly support the suggested pattern; for example, half of the sites occur within 150
m of drainages of these ranks, while only 17 percent of the study region lies within
this distance of such drainages.


The explanation of this pattern that probably comes to mind first is that of
human selectivity: the prehistoric inhabitants purposefully placed their sites in
proximity to relatively secure sources of water in order to obtain water more easily.
Various sources of ethnographic evidence and the aridity of the southern Colorado
plains could be argued to be supportive of this hypothesis. An obvious and related
alternative hypothesis is that the inhabitants tended to locate activity close to
drainages not for the water but for some other related resource. For example, they
might have been exploiting plants that tend to be found near water, or they might
have chosen stream-associated locations in order to hunt a variety of game animals,
such as bison, that might be drawn to water. This is a common case, where one
variable (proximity to water) might be only a proxy for some other variable
(availability of plant foods or prey animals) that actually was important. Supporting
data for this competing hypothesis would be hard to obtain. Such data might
include appropriate floral and faunal remains in a suitable archaeological association
from sites both close to and far from the drainages. In addition, the sample of sites
would need to be large enough to yield statistically reliable conclusions.


A third hypothesis might be that the observed pattern is a result of geological
processes or vegetation patterns that have buried or hidden sites located great
distances from these drainages and exposed sites lying in proximity to the
drainages. In the present case, a geomorphological study of the region (Schuldenrein
1983) found the reverse to be true; the primary areas of alluviation were along major
drainages. Additionally, vegetation in this region (which affects site visibility) is
densest along major drainages and very light or nearly absent far from drainages
(Van Ness 1984).

Finally, a fourth explanation of the pattern might be that it is the result of
sample selection bias. Since a random sampling design was employed for site
discovery (based on randomly placed transects), and assuming the trustworthiness
of the survey crews and uniformity of their procedures, this hypothesis seems an
unlikely candidate.


Certainly there are other alternative hypotheses for explaining the observed
relationship. In this case, as in all cases involving hypothesis testing, the alternative
for which the greatest amount of supporting evidence can be obtained should be
advanced as the most likely explanation. It is also quite possible that several of the
hypotheses could be true.


ASSESSING MODEL PERFORMANCE 


In previous sections initial or “apparent” accuracy rates were presented for
several models. Apparent accuracy rates were obtained by applying a model to the
same data used to generate the model. This practice, as noted in those discussions,
tends to give an inflated view of true model accuracy and underrepresent true
model error rates. The purpose of this section is to examine methods that can yield
truer indications of actual model performance and to offer statistical significance
tests of model performance. It is emphasized that regardless of how a model is
developed (from theoretical expectations or from empirical data), most of the
following methods for testing apply. These methods should be used to validate the
performance of any model prior to its application to management or research
problems.


Adjustable Accuracy Rates 


Site location models discussed in previous sections were designated as having
classified a percentage of sites correctly and a percentage of nonsites correctly. Some
of the models classified only about 70 percent of the sites correctly (and some had a
lower rate than this), which might not be very useful from a practical standpoint;
the 70 percent correct figure means that 30 percent or more of the sites were
incorrectly classified. This is fairly costly given the nature of the resource and our
goal of developing models that have some potential for real-world application. One
way to resolve this problem might be to obtain better data or to make operational
new variables that would yield stronger models, but either solution could entail
additional cost and effort. Even if such models were developed, some sites and some
nonsites will always be incorrectly classified by a model, and the accuracy rates
might be less than desirable or be unacceptable for practical applications.


A solution to this problem is to accept a trade-off, to exchange increased
accuracy in classifying sites for decreased accuracy in classifying nonsites since it
costs us less to call a nonsite a site compared with the reverse (using the terminology
introduced in Chapter 3, we decrease gross error by increasing wasteful error). The
decision rule used for examining the initial apparent accuracy rates of all previous
models (e.g., Tables 8.2, 8.4, and 8.5) was a maximum-likelihood rule; locations were
assigned to the class (site or nonsite) to which they were most similar. For several of
the models this amounted to assigning a location to the site class based on a cutoff
point of p = 0.5. In order to trade nonsite classification accuracy for increased site
accuracy, we need only change this p-value to a lower cutoff, for example, to p =
0.25. In terms of the measurement space (Figure 8.5b), this change moves the
decision boundary upward, causing more of the sites to be correctly classified (but
causing more nonsites to be incorrectly classified). The logical extreme for this
tactic would be to choose a cutoff of p = 0.0, which would cause the entire
measurement space to be classified to the site group (but this would create the
absolutely accurate but useless predictor of site locations described in an earlier
section).


We can use the Glade Park nine-variable site location model presented in the
“Example Analysis” section as an illustration of this procedure. This model is based
on a sample of 157 known site and 157 known nonsite locations, obtained through a
cluster sample of 38 quarter-section quadrats. The Glade Park model was applied to
estimate site-group p-values for each of these 314 locations based on their nine
environmental measurements. Histograms of these p-values are given in Figure
8.11a. If we use the traditional cutoff point (p = 0.5), 70.1 percent of the sites fall
above this point in the site histogram, while 66.2 percent of nonsites fall below this
cutoff in the nonsite histogram. (It is this process that yields the accuracy rate
predictions given in the two-by-two matrix in Table 8.2b.) If we were to use a lower
cutoff, it is readily apparent from Figure 8.11a that more sites would be classified
correctly and more nonsites incorrectly. This effect is summarized in Table 8.6
using cutoff p-values of 0.0, 0.1, 0.2, . . ., 0.9, 1.0, and it is graphed in Figure 8.11b. Of
course, when cutoff probabilities of p = 0.0 and p = 1.0 are used, every location is
classified either as a site or as a nonsite, respectively, and we have a zero-gain
predictive model. On the other hand, at a cutoff of p = 0.2, 96.2 percent of the sites
and 26.1 percent of the nonsites are correctly classified; at p = 0.4, 82.8 percent of the
sites and 52.9 percent of the nonsites are correctly assigned, etc.

Although application of a model to the same data used to build the model can
yield inflated performance indications, the researcher can use these data, together
with the adjustable rate method, to design a model that performs at an approximate
level of accuracy. For example, using the information given in Table 8.6 and
graphed in Figure 8.11b, the researcher can contrive a model that apparently
predicts approximately 90 percent of the sites correctly by selecting a cutoff value of
p = 0.3. At this cutoff point approximately 38 percent of the nonsites would also be
correctly assigned. (Of course, these figures are undoubtedly inflated to some
extent because the Glade Park model has not yet been tested with independent
data. The same procedures apply after testing, however, and it is shown below that
very similar results are obtained.)
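
The cutoff adjustment described above amounts to recounting correct classifications at each candidate cutoff. The following Python sketch shows one way this could be done, assuming that a location is assigned to the site class when its estimated p-value is at or above the cutoff. The function name `accuracy_at_cutoff` and the short lists of p-values are hypothetical and are not the Glade Park data.

```python
def accuracy_at_cutoff(site_p, nonsite_p, cutoff):
    """Return (percent of sites correct, percent of nonsites correct).

    A location is assigned to the site class when its p-value is at or
    above the cutoff, and to the nonsite class otherwise.
    """
    sites_correct = sum(1 for p in site_p if p >= cutoff)
    nonsites_correct = sum(1 for p in nonsite_p if p < cutoff)
    return (100.0 * sites_correct / len(site_p),
            100.0 * nonsites_correct / len(nonsite_p))


# Hypothetical p-values for illustration only (not the Glade Park data).
site_p = [0.15, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
nonsite_p = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75]

for tenth in range(0, 11):
    cutoff = tenth / 10.0
    sites, nonsites = accuracy_at_cutoff(site_p, nonsite_p, cutoff)
    print(f"cutoff {cutoff:.1f}: sites {sites:5.1f}%  nonsites {nonsites:5.1f}%")
```

As in the text, the extremes behave trivially: a cutoff of 0.0 assigns every location to the site class and a cutoff of 1.0 assigns every location to the nonsite class, while intermediate cutoffs trade site accuracy against nonsite accuracy.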


TABLE 8.6.
Illustration of changing cutoff p-values and their effects on site and nonsite classification
accuracy using the nine-variable Glade Park model data


                     Correct Predictions                  Incorrect Predictions
                   Site            Nonsite              Site            Nonsite
Cutoff Point   Number Percent   Number Percent      Number Percent   Number Percent

    0.0         157    100.0       0     0.0           0     0.0      157    100.0
    0.1         157    100.0      21    13.4           0     0.0      136     86.6
    0.2         151     96.2      41    26.1           6     3.8      116     73.9
    0.3         142     90.5      60    38.2          15     9.5       97     61.8
    0.4         130     82.8      82    52.9          27    17.2       75     47.8
    0.5         110     70.1     104    66.2          47    29.9       53     33.8
    0.6          87     55.4     126    80.3          70    44.6       31     19.7
    0.7          58     36.9     144    91.7          99    63.1       13      8.3
    0.8          33     21.0     157   100.0         124    79.0        0      0.0
    0.9          12      7.6     157   100.0         145    92.4        0      0.0
    1.0           0      0.0     157   100.0         157   100.0        0      0.0





Clearly, the actual number of iocations (¢.g., small-area units, such as acres) in 
a region of study that are nonsites is usually far greater than the number that are 
sites—on the order of 100 nonsites for every site. (This is usually referred to as the a 
priori or base rate probability problem and will be discussed below.) Thus, our claim 
that 38 percent of the nonsites are correctly assigned by a model essentially means 
that nearly 38 percent of the area of the study region as a whole is unlikely to contain 
sites (in this example, if tive 28 percent area were mapped, it would only contain 
about 10 percent of all sites). If the study area is extensive this could amount to a 
sizable area that is largely devoid of sites. Thus, another important function of the 
nonsite control group is to provide area estimates about model performance. In 
other words, the nonsites provide data concerning the estimated area of a model at a 
particular cutoff point when mapped. If the data in Table 8.6 are correct indications 
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of the Glade Park model’s accuracy, we could claim that 90.5 percent of the sites 
should occur in only about 100 - 38.2 = 61.8 percent of the total Glade Park land area,
which could be called a “high site-sensitivity zone,” and that 9.5 percent of the sites
should occur in the other 38.2 percent of the land area, which constitutes a “low
site-sensitivity zone.” Computer mapping techniques shown in the previous sections
and described in detail in Chapter 10 may be used to provide maps of these
sensitivity areas.
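
The cutoff adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the rule that a location is classed as a site when its estimated probability is at or above the cutoff, and the small lists of p-values are all hypothetical and are not the Glade Park data.

```python
def accuracy_by_cutoff(site_p, nonsite_p, cutoffs):
    """Percent of sites and nonsites correctly classified at each cutoff.

    Assumed rule (not stated in the text): p >= cutoff means "site".
    Returns a list of (cutoff, percent_sites_correct, percent_nonsites_correct).
    """
    rows = []
    for cutoff in cutoffs:
        sites_ok = sum(1 for p in site_p if p >= cutoff)
        nonsites_ok = sum(1 for p in nonsite_p if p < cutoff)
        rows.append((cutoff,
                     100.0 * sites_ok / len(site_p),
                     100.0 * nonsites_ok / len(nonsite_p)))
    return rows


# Hypothetical estimated probabilities, for illustration only.
site_p = [0.9, 0.7, 0.6, 0.35, 0.2]
nonsite_p = [0.8, 0.45, 0.3, 0.25, 0.1, 0.05]

for cutoff, sites_pct, nonsites_pct in accuracy_by_cutoff(site_p, nonsite_p, [0.0, 0.3, 1.0]):
    print(cutoff, sites_pct, nonsites_pct)
```

Under the reasoning in the text, the nonsite percent correct at a given cutoff doubles as a rough estimate of the share of the study area that would be mapped as the low site-sensitivity zone.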


This cutoff adjustment approach is not necessarily restricted to models that
yield a 0-1 scale of estimated probabilities. The simple mathematical models
discussed in previous sections can also be examined in this context. The performance
of minimum distance models might be assessed by investigating accuracy
rates at various cutoff ratios of distance to class centroids, for example. Similarly,
performance statistics from a number of “slices” might be examined in a level-slice
approach.


From the foregoing it should be apparent that the use of overall accuracy rates
(i.e., the combined site and nonsite accuracy) to evaluate the performance of a site
location model, a fairly common practice (e.g., Berry 1984), is not only misleading
but inappropriate. To illustrate, suppose that a sample survey from a large region
discovers 100 site locations. It is possible to obtain virtually any sample size of
nonsites as a control group since it is not uncommon for 99 percent of many study
regions to be classifiable as “site absent.” Let us say that 9900 locations are chosen
for the nonsite control group, for a total sample size of 10,000. (Although this may
seem to be a ludicrously large number, such sample sizes are possible through use of
computer data bases, as Chapter 10 will show.) If all 10,000 locations were arbitrarily
classed as nonsites, an impressive overall accuracy rate of 99 percent would be
achieved (100 x [total correct] / [total cases] = 100 x [9900 + 0]/[10,000]), but the
resulting model would be useless. It is clear that performance must be judged by
focusing on percent correct rates for sites and nonsites individually.
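
The arithmetic of this example can be checked with a short sketch. The function name and argument order are hypothetical; the counts are the ones given in the text (100 sites, 9900 nonsites, every location classed as a nonsite).

```python
def accuracy_rates(sites_correct, n_sites, nonsites_correct, n_nonsites):
    """Return (overall, site, nonsite) percent correct rates."""
    overall = 100.0 * (sites_correct + nonsites_correct) / (n_sites + n_nonsites)
    site_rate = 100.0 * sites_correct / n_sites
    nonsite_rate = 100.0 * nonsites_correct / n_nonsites
    return overall, site_rate, nonsite_rate


# Every one of the 10,000 locations is classed as a nonsite.
overall, site_rate, nonsite_rate = accuracy_rates(0, 100, 9900, 9900)
print(overall, site_rate, nonsite_rate)  # 99.0 0.0 100.0
```

The overall rate of 99 percent hides the fact that no site at all is correctly classified, which is why the two class rates are reported separately.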


Model Validation Procedures 


The nine-variable Glade Park model has already been used to illustrate 
performance adjustment; in this section it will be used to demonstrate several 
model validation techniques. We can assume that the apparent performance rate 
statistics given in Figure 8.11b and Table 8.6 are inflated, but to an unknown 
degree. In other words, when this model is applied to other locations in the study
region, actual accuracy rates may be lower than those indicated in the figure and 
table. The inflated performance statistics result from a number of factors. Primary 
among these is that the same data were used to build the model and to estimate the 
percent correct prediction rate (Table 8.6). Since the Glade Park model is based on 
differences between site and nonsite locations in that specific sample, the statistical 
procedures capitalize on variation in that sample such that apparent accuracy rates 
are maximized (Swain 1978:163). Violations of statistical assumptions, such as the
violation of the independence assumption that results from spatial autocorrelation (see above and
Chapter 5), further widen the difference between apparent and actual accuracy 
rates, particularly in this cluster-sampled context (Basu and Odell 1974; Campbell 
1981; Tubbs and Coberly 1978). 


Randomization procedures for assessing the upward classification bias that 
results when a model is applied to the same data with which it was generated have
been developed by Frank et al. (1965). These procedures were applied by Berry 
(1984) in a paper that generally attempted to discredit certain cultural resource 
modeling approaches. The procedures of Frank et al. (1965) and the findings of 
Berry (1984) are both germane to this discussion. One randomization procedure 
requires generating random normal deviates as a “synthetic” validation sample, 
and then developing a model based on these random data. The resultant classifica- 
tion accuracy, when the model is applied to the synthetic data set, reflects upward 
bias attributable to the procedure itself, since the robust properties of many 
multivariate classification models can cause a better-than-chance fit even to random 
data. Berry (1984) points out that in two such simulations by Frank et al. (1965), 
which used the discriminant analysis model, average overall classification rates of 
68.2 and 72.6 percent were achieved, which would seem to reflect poorly on 
discriminant analysis efforts in general, including those in archaeology. Berry does 
not mention, however, that one simulation used 19 predictor variables with 150 
cases and the other used 25 variables with only 98 cases (Frank et al. 1965:256). The 
large number of variables relative to the number of cases is an example of what can 
be called hyperfitting of a model to the data. It is possible, through use of large
numbers of predictor variables, to obtain very strong fits regardless of the degree of
patterning in the data (using n-1 predictor variables in a discriminant analysis
guarantees a perfect classification, for example). This property is demonstrated in 
Berry's Table 2. Using random data and 30 cases, Berry shows through simulation 
that four variables yield an overall correct rate of 53 percent (3 percent upward bias); 
8 variables, 70 percent (20 percent upward bias); 12 variables, 77 percent (27 percent 
upward bias); and 20 variables, 93 percent (43 percent upward bias). The difficulty 
in real-world applications is to obtain a good fit with few variables relative to the 
number of cases. This randomization procedure seems useful, however, when the 
numbers of variables and cases are matched to those actually used to develop an 
archaeological model. The results could give an excellent indication of the size of 
upward bias an investigator might be facing. 


The second randomization procedure for investigating upward bias described 
by Frank et al. (1965) utilizes the actual model data for the predictor variables. In 
this case, though, the true value of the dependent variable, class membership, is 
randomized and a classification model is produced based on the randomized groups. 
The advantage of this procedure is that the actual model data are used, allowing the 
upward bias result to pertain more closely to the model under investigation. Berry
(1984:849) utilizes this technique, with 10 replications, to illustrate a mean random- 
ized classification rate of 71.6 percent, suggesting that the apparent overall accuracy 
rate of 85 percent for an archaeological locational model developed for the Bureau of 
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Land Management is mostly attributable to upward bias. The particular model that 
Berry examined was based on six variables and 174 cases (Burgess et al. 1980). Berry 
achieved his result, however, by using the six variables plus 10 additional ones that
were eliminated from consideration in the original study owing to the lack of
significant findings of several univariate and multivariate tests made on these 
variables. Thus, Berry achieved a mean randomization rate of 71.6 percent by 
hyperfitting 16 variables to 174 cases. I reran this randomization procedure with 10 
replications on the original six variables, which yielded overall classification rates 
ranging from 49.4 to 60.9 percent with a mean rate of 56.9 percent, an upward 
inflation of less than 2 percent (the rate expected by chance in this case is 55.6 owing 
to unequal class sample sizes; Berry 1984). These findings are more in line with the 
amount of upward bias one might expect when the number of predictor variables is 
small relative to the number of cases. 
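
The second randomization procedure (randomizing class membership while keeping the actual predictor data) can be sketched as follows. This is a minimal illustration, not the procedure as run by Frank et al. or Berry: a simple nearest-centroid rule is swapped in for discriminant analysis so that the example needs only the standard library, and all function names are hypothetical.

```python
import random


def centroid(rows):
    """Mean of each predictor variable over a list of cases."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest_centroid_accuracy(X, y):
    """Build a nearest-centroid rule on (X, y) and score it on the same data."""
    c1 = centroid([x for x, label in zip(X, y) if label == 1])
    c0 = centroid([x for x, label in zip(X, y) if label == 0])

    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    hits = sum(1 for x, label in zip(X, y)
               if (1 if dist2(x, c1) < dist2(x, c0) else 0) == label)
    return hits / len(y)


def randomized_label_rates(X, y, replications=10, seed=0):
    """Shuffle class membership, refit, and record the apparent accuracy."""
    rng = random.Random(seed)
    rates = []
    for _ in range(replications):
        shuffled = list(y)
        rng.shuffle(shuffled)  # class sizes are preserved, labels are random
        rates.append(nearest_centroid_accuracy(X, shuffled))
    return rates
```

The mean of the returned rates, compared with the rate expected by chance for the given class sizes, gives a rough indication of upward bias for that combination of variables and cases.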


There are a number of ways to conduct independent tests of a site location 
model's performance. In an independent test, data that are different from the informa- 
tion used to build a model are used to test the model in order to eliminate the 
possibility of model capitalization on chance sampling variation. The strongest test 
of model performance would require an additional independent survey. Site loca- 
tion models could be applied to these independent data to derive unbiased esti- 
mates of model performance accuracy rates. (Ideally, the independent survey would 
be conducted by archaeologists different from those who collected the data used to
construct the initial site location model.) In many cases it is difficult or impossible
owing to cost constraints to conduct a second, independent survey. For this reason,
a number of alternative procedures have been developed that attempt to provide
independent testing information but do not require that a second survey be
performed. Two of these procedures, which were introduced in Chapters 5 and 7,
are split sampling and the jackknife method.


Split Sampling 


Split sampling traditionally requires randomly splitting a sample of cases (sites 
and nonsites) in half, building a model with one half, and testing the model with the 
second, independent half (Mosteller and Tukey 1977:38; see Chapter 7). A problem 
with this method results from the use of cluster sampling. There is within-cluster 
spatial correlation between analysis locations so that sites and nonsites in one of the 
split groups may not necessarily represent completely independent information 
relative to sites and nonsites in the other group. 


A better split-sampling technique for cluster-sampled data requires that the 
clusters be randomly split into two groups of equal size. The model is then built with 
data from one-half of the clusters and tested with the second half, which now can be 
argued to be independent of the first half. This approach was applied to the Glade 
Park analysis data. The 38 sampling quadrats were randomly split into two groups of 
18 (two of the quadrats contained neither sites nor nonsites and are excluded here); 
models were then built using the same nine variables used in the previous Glade
Park analyses for each half, and their performance was assessed using the data from
the other half. The classification accuracy curves for all possible model cutoffs are 
illustrated in Figure 8.12a, as are the average performances of the two models. The 
inflation in accuracy of the original model, which amounts to a few percentage
points, can be seen when the performance curves in Figure 8.12a are compared with
those in Figure 8.11b. 
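
Splitting by cluster rather than by case can be sketched as below. The data layout (a list of `(cluster_id, record)` pairs) and the function name are assumptions made for illustration; any model-building step would then be run on one half and tested on the other, and vice versa.

```python
import random


def split_by_cluster(cases, seed=0):
    """Randomly assign whole clusters to two halves.

    cases: list of (cluster_id, record) pairs, one per site or nonsite location.
    Returns two lists of cases; no cluster contributes to both.
    """
    cluster_ids = sorted({cid for cid, _ in cases})
    random.Random(seed).shuffle(cluster_ids)
    first = set(cluster_ids[:len(cluster_ids) // 2])
    half_a = [case for case in cases if case[0] in first]
    half_b = [case for case in cases if case[0] not in first]
    return half_a, half_b


# 36 hypothetical quadrats with three locations each.
cases = [(cid, j) for cid in range(36) for j in range(3)]
half_a, half_b = split_by_cluster(cases, seed=1)
print(len({cid for cid, _ in half_a}), len({cid for cid, _ in half_b}))  # 18 18
```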


A drawback of the split-sampling approach is that only half of the available
information is utilized in developing a model (since the other half must be reserved 
for testing), which is a waste of costly information. One approach that utilizes all of 
the available data is discussed in the next section. 


Jackknife Methods


The jackknife method (Lachenbruch and Mickey 1968) was developed as a 
means of providing a less biased assessment of the performance of a classification 
model while allowing all information to be used in model construction. In the 
traditional jackknife, one of the n cases is temporarily discarded, and the remaining
n-1 cases are used to build a classification model. The discarded case is then
independently classified by the model. This procedure is repeated, eliminating each
case in turn, to establish an independent test of model performance. Thus, unlike
split sampling where half of the cases are normally discarded, the jackknife requires
that only one case be left out at any one time, which allows retention of most of the
information. An additional benefit of this procedure is that the n resulting models,
each providing a slightly different result, can be combined into a single model to
provide a better estimated or jackknifed model (Mosteller and Tukey 1977:152). A
model derived from n models is usually superior to the traditional model based on n
cases because each coefficient in the combined model is based on n estimated
coefficients from the individual models. The BMDP discriminant analysis program 
7M provides the jackknife as an option (Dixon et al. 1983). 


In an archaeological site location modeling context, where some form of cluster 
sampling is normally applied, a modified jackknife procedure can be used. This is 
necessary because, as noted in the section on split sampling, analysis locations in the 
same cluster might be spatially correlated. Testing a case against a model derived 
from the other cases in the same cluster may not yield an entirely independent 
assessment. The modified jackknife technique requires discarding all cases in one of
the k clusters, building a model with the cases in the remaining k-1 clusters, and
testing the model on the data in the discarded cluster. This procedure is repeated,
with data in each cluster in turn being reserved as the test cases, until k models have
been developed and data in each of the clusters have been tested.
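
The modified (leave-one-cluster-out) jackknife can be sketched generically. Everything named here is hypothetical: `fit` and `predict` stand for whatever model-building and classification steps are in use, and the one-variable midpoint rule is only a stand-in so the example runs on its own. It is not the logistic regression model used for Glade Park.

```python
def cluster_jackknife(cases, fit, predict):
    """Leave-one-cluster-out testing.

    cases   : list of (cluster_id, x, y), with y = 1 for sites and 0 for nonsites
    fit     : function(list of (x, y)) -> model
    predict : function(model, x) -> 0 or 1
    Returns a list of (true_label, predicted_label) pairs, one per case.
    """
    cluster_ids = sorted({cid for cid, _, _ in cases})
    results = []
    for held_out in cluster_ids:
        train = [(x, y) for cid, x, y in cases if cid != held_out]
        test = [(x, y) for cid, x, y in cases if cid == held_out]
        model = fit(train)  # built without the held-out cluster
        results.extend((y, predict(model, x)) for x, y in test)
    return results


def fit_midpoint(train):
    # Stand-in model: a threshold halfway between the two class means.
    site_x = [x for x, y in train if y == 1]
    nonsite_x = [x for x, y in train if y == 0]
    return (sum(site_x) / len(site_x) + sum(nonsite_x) / len(nonsite_x)) / 2


def predict_midpoint(threshold, x):
    return 1 if x > threshold else 0


# Three hypothetical clusters, each with one site and one nonsite.
cases = [(0, 9.0, 1), (0, 1.0, 0), (1, 8.0, 1), (1, 2.0, 0), (2, 10.0, 1), (2, 3.0, 0)]
results = cluster_jackknife(cases, fit_midpoint, predict_midpoint)
```

Tallying the returned pairs separately for sites and nonsites gives the jackknifed percent correct rates for each class.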


When this jackknife method was applied to the Glade Park model data, 36
models, each constructed by eliminating locations in a different sampling unit
cluster, were developed. In Table 8.7 the original model, L(0), based on all 314 of the
sites and nonsites is given first, followed by the 36 models derived by leaving out the
locations in a single cluster. The performance rates found by applying the kth model
to the discarded cases of the kth cluster are given in Figure 8.12b. At the p = 0.5 cutoff
point (Figure 8.12b) about 66 percent of the sites and 64 percent of the nonsites are
correctly classified, compared with 70 and 66 percent, respectively, for the initial
model (Figure 8.11b and Table 8.6). At the p = 0.3 cutoff the jackknifed data (Figure
8.12b) suggest that 87 percent of the sites and 32 percent of the nonsites are
correctly classified, in contrast to 91 and 38 percent, respectively, for the initial
model (Figure 8.12b). Thus, the jackknife suggests moderate decreases in model
performance, and these rates may be taken as better estimates of “true” model
performance rates. Similar results have been noted elsewhere (e.g., Campbell 1981).


Figure 8.12. Glade Park model performance curves: (A) split-sampled models, (B) jackknifed performance and
jackknifed model applied to independent data


TABLE 8.7.
Original nine-variable Glade Park site-location model (L[0]), 36 jackknifed models, and final composite
jackknifed model (L[*])


The jackknifed site location model, created by taking a weighted average of 
the coefficients of the 36 individual models, is given as L(*) in the last line of Table 
8.7 (see Mosteller and Tukey 1977:152). 


Completely Independent Samples 


It was indicated above that one of the most reliable ways to test the perform- 
ance of a site location model is to apply it to data from a second, independent 
survey of random sampled data. Such data were not available in Glade Park, but the 
existing site files, which contain information on many hundreds of known sites, 
provide a large body of independent site location information. Site forms on file at 
the local BLM office were carefully screened for quality of information, particularly 
with regard to accurate locational data (see Chapter 7). A simple random sample of 
50 sites that represented a well-spread distribution of sites from throughout the 
Glade Park region was selected (see Kvamme 1983c for details). A control group of 
nonsite locations was also chosen so that model performance could be assessed. 
These nonsites were not necessarily selected from previously surveyed regions and 
thus actually represent the “environment at large” (rather than true nonsites), but
they may still be referred to as nonsites. As discussed earlier, when the prior or
chance probability of a site is very low in a region (and in this case the site prior
probability has been estimated to be as low as P(S) = 0.02; see below), nonsites can be
selected at random from throughout a study region regardless of whether or not the
locations have been field inspected. The advantage of this procedure is that better
estimates of nonsite variation can be obtained than if the nonsites were restricted to
a limited number of surveyed clusters. The disadvantage is that some small
percentage (here about 2 percent) of the nonsites are misclassified because they are 
really sites. In the present case 87 nonsites were selected at points located at the 
center of each of 87 randomly selected sections throughout the region (on a chance 
basis, only one or two of them should fall on sites). 


The jackknifed site location model (last line of Table 8.7) was applied to 
measurements performed at the 50 independent site and 87 independent nonsite 
locations. The results of this test, superimposed on the jackknifed results (Figure
8.12b), are very supportive of the performance rates determined by other means. 
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Statistical Tests 


Model classification performance indications are often unreliable owing to a 
failure to meet the various assumptions of the model used and particularly when the 
same data are used to build a model and to assess it. In making a statistical 
assessment of a model's performance, it is much safer to use independent test
samples. In other spatially oriented disciplines, statistical significance of model
predictions and confidence limits around predictions are commonly determined
through the use of independent test samples (e.g., Hay 1979; Rosenfield et al. 1982;
Schowengerdt 1983:109-195).


The most common performance assessment of a classification model involves 
determination of accuracy rates (percent correct statistics). The following sections 
present a significance test for model classification results and procedures for establishing
confidence limits around percent correct statistics obtained when a model is
applied to independent test samples. Also presented is a graphic technique for
assessing the goodness of fit of a model to the empirical data, which offers an
alternative to accuracy rate statistics. Associated with this technique is a significance
test that is appropriate for application to the same data set from which the
locational model was derived. Finally, a sequential analysis approach is presented
that minimizes the size of independent test samples needed to test a model by 
requiring the collection of new data only until a decision about model performance 
can be reached. 


Testing the Significance of Model Classification Results


When an archaeological locational model is applied to independent test samples
in a two-class problem (e.g., samples of sites and nonsites), the resulting
classification can be statistically assessed through a relatively simple chi-square test
for differences in classification probabilities. This test assumes that (a) independent
test samples from both classes (populations) are being used, (b) the test samples are
random samples, (c) the two samples are mutually independent (i.e., the locations in
the site sample really have sites and the locations in the nonsite sample do not have
sites), and (d) the locations can be unambiguously assigned by a model (decision
rule) to either of the classes. The data are arranged in a 2 by 2 contingency table, as
shown in Table 8.8a.


A one-tailed test is most appropriate since we are testing for direction in the
table, i.e., we are testing whether the model has some utility for making correct
classifications. The null hypothesis states that the probability that a location
belonging to the site-present population will be classified by the model to that
population is less than or equal to the probability that a location from the
site-absent population will be classified to the site-present class. Rejection of the null
hypothesis implies acceptance of the alternative: that a location from the
site-present population has a greater probability of being classified to that population
than does a location from the site-absent population, indicating that the model has
some predictive utility. The test statistic T (see Table 8.8a for explanation of symbols)
is compared against the (1 - 2α) quantile of the chi-square distribution with one
degree of freedom. If T exceeds that value, the null hypothesis may be rejected (see
Conover 1971:141-146).


TABLE 8.8.
Assessment of model classification results: (a) set-up for a 2 by 2 table; (b) classification results of
jackknifed model applied to independent Glade Park data (at p = 0.4 cutoff point); (c) goodness-of-fit
test data with fixed cutoff points applied to data used to establish initial Glade Park model
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The independent Glade Park test data results from the previous section can be 
used to illustrate application of this test. The independent sample of site locations
was taken from existing site file information at a local BLM office (this sample was
discussed above); because it is possible that survey biases might be reflected in this
sample, in practice it would be more desirable to use sites obtained from an
independent field survey conducted under a random sampling design. The independent
sample of nonsite locations is actually a sample of locations taken at random
from the background environment at large (this sample was also described above).
Without field checking, there is no way of knowing for certain whether or not a
particular location in this sample contains a site; an estimate of the base rate or a
priori chance of a site occurring at a location in the region (see below), however,
indicates that approximately 94-98 percent of this sample should not contain sites.
Although the third assumption listed above technically is violated, the performance 
of the test should be modified only slightly given the low rate of site occurrence (the 
principal effect will be to make acceptance of the null hypothesis more likely 
through a reduction in the apparent significance of the model). Note that even if a 
sample of actual nonsite locations were obtained, there would always be some 
uncertainty about the absence of sites from all sample locations owing to the 
possibility of sites having been missed during survey and to the potential presence 
of buried sites. 


The independent test data indicate that at the p = 0.4 cutoff point approximately
86 percent of the locations with sites and 43 percent of the locations without
sites are correctly classified by the Glade Park jackknifed model (Figure 8.12b),
which produces the 2 by 2 structure shown in Table 8.8b. When computed using
these data, the test statistic yields

    T = \frac{137 [(43)(37) - (7)(50)]^2}{(50)(87)(43 + 50)(7 + 37)} = 11.853


At a level of significance of 0.001 the null hypothesis will be rejected if T exceeds
9.549 (from a table of the chi-square distribution with one degree of freedom). It is
therefore rejected in the current case, which suggests that the model has some
predictive utility at the p = 0.4 cutoff point. (A common complaint with contingency 
table tests in archaeology is that a significant result might be due to only one cell 
with a large deviation from expectation. In this testing context, however, if most of 
the test statistic is due to one cell it means that either the model does a better job 
than chance at classifying sites or that the model does better than chance at 
classifying nonsites. In either case we win because the model offers some gain over 
pure chance.) 
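
The computation of the test statistic can be reproduced with a short sketch. The cell labels a, b, c, and d are hypothetical names introduced here (Table 8.8a defines the chapter's own symbols); the counts and the critical value of 9.549 are those given in the text.

```python
def classification_chi_square(a, b, c, d):
    """Chi-square statistic for a 2 by 2 classification table.

    a: sites classified as sites        b: sites classified as nonsites
    c: nonsites classified as sites     d: nonsites classified as nonsites
    """
    n1, n2 = a + b, c + d  # sizes of the site and nonsite test samples
    n = n1 + n2
    return n * (a * d - b * c) ** 2 / (n1 * n2 * (a + c) * (b + d))


# Glade Park independent test data at the p = 0.4 cutoff.
a, b, c, d = 43, 7, 50, 37
T = classification_chi_square(a, b, c, d)

# One-tailed test: reject only if T is large AND the direction is correct,
# i.e. sites are classed as sites more often than nonsites are.
reject = T > 9.549 and a / (a + b) > c / (c + d)
print(round(T, 3), reject)  # 11.853 True
```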


Before leaving the subject of testing model accuracy rates, it should be noted 
that a number of additional procedures currently being examined in other disciplines
warrant investigation by archaeologists. These include specialized analysis of
variance techniques (Rosenfield 1981) and the set of methods known as discrete
multivariate analysis (Congalton et al. 1983). Both of these approaches offer the
potential for significance testing of individual and overall classification results in
tables larger than 2 by 2, making them suitable for multiclass modeling problems
(e.g., models for multiple site-type classes).








Establishing Confidence Limits Around Accuracy Probabilities 


Since the classification of a test location by a model or decision rule is either
right or wrong (i.e., a site or nonsite location is correctly identified or it is not), the
correctness of a classification assignment at each location represents a binomial
population. The Glade Park independent data test results (Table 8.8b) indicate that
86 percent of the site locations and 43 percent of the nonsite locations should be
correctly classified by the jackknifed model (at the 0.40 cutoff). These percent
correct statistics, which represent estimated mean probabilities of correct classification
(when divided by 100), can be considered random variables with a binomial
probability distribution. Associated levels of statistical error can be found in tables
or graphs of confidence limits of the mean of a binomial distribution (e.g., Conover
1971:380-381; Hord and Brooner 1976). Hord and Brooner (1976:672) give the
following as the approximate 100(1 - α) percent confidence interval for p, the
proportion of successes, given n trials:


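    p_{lower, upper} = \frac{p + \frac{z_{\alpha/2}^2}{2n} \mp \frac{z_{\alpha/2}}{\sqrt{n}} \sqrt{p(1 - p) + \frac{z_{\alpha/2}^2}{4n}}}{1 + \frac{z_{\alpha/2}^2}{n}}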











When the Glade Park results for the site class are used, the proportion of sites
correctly classified by the model is p = 0.86 and n = 50. For a 95 percent confidence
interval, a table of the normal distribution (found in any statistical text) gives
z_{\alpha/2} = z_{0.025} = 1.96. The limits of the 95 percent confidence interval become

    p_{lower} = \frac{0.86 + \frac{1.96^2}{2(50)} - \frac{1.96}{\sqrt{50}} \sqrt{(0.86)(1 - 0.86) + \frac{1.96^2}{4[50]}}}{1 + (1.96^2/50)} = 0.74

    p_{upper} = \frac{0.86 + \frac{1.96^2}{2(50)} + \frac{1.96}{\sqrt{50}} \sqrt{(0.86)(1 - 0.86) + \frac{1.96^2}{4[50]}}}{1 + (1.96^2/50)} = 0.93

or P(0.74 ≤ p ≤ 0.93) = 0.95. Similar calculations for the nonsite class (with p = 0.44, n =
87) yield a 95 percent confidence interval of 0.34 ≤ p ≤ 0.54.
In theory, such confidence intervals indicate that, 95 percent of the time,
independent test samples should yield proportion correct statistics (p) between
these limits. In other words, if we had numerous independent test samples of known
site-present locations (and n = 50), in about 95 percent of those samples the
proportion of sites correctly classified by the model would be between 0.74 and 0.93.
The range produced by these limits thus gives a more realistic idea of true model
performance.

The width of a confidence interval at a given level of significance is a direct
function of the size of the sample used to compute the interval. Hence, it is
important to obtain large test samples in order to produce narrower confidence
limits. To illustrate, if we increase n to 150 for the above site-class 95 percent
confidence interval (leaving p = 0.86), we obtain 0.80 ≤ p ≤ 0.91. Increasing n to 300
gives 0.82 ≤ p ≤ 0.89. Upper and lower confidence limit values can be inserted into
other formulas (e.g., the gain statistic or those shown in the base rate probabilities
section, below) to assess upper and lower bounds on other dimensions of model
performance. Confidence intervals are not restricted to 2 by 2 tables but may be
applied to results obtained from tables of any size (e.g., in problems with multiple
site types). Parker (1985) illustrates use of the Poisson distribution when estimated
mean archaeological probabilities are extremely low (e.g., p = 0.05).
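
The confidence limits quoted in this section can be reproduced with the following sketch, which assumes the interval has the form implied by the worked site-class calculation above (equivalent to what is commonly called the Wilson score interval). The function name is hypothetical.

```python
import math


def binomial_confidence_limits(p, n, z=1.96):
    """Approximate confidence limits for a proportion p observed in n trials.

    z is the normal deviate for the chosen level (1.96 for 95 percent).
    """
    centre = p + z * z / (2 * n)
    spread = (z / math.sqrt(n)) * math.sqrt(p * (1 - p) + z * z / (4 * n))
    denom = 1 + z * z / n
    return (centre - spread) / denom, (centre + spread) / denom


# Site class (p = 0.86, n = 50), nonsite class (p = 0.44, n = 87),
# and the site class again with a larger test sample (n = 300).
for p, n in [(0.86, 50), (0.44, 87), (0.86, 300)]:
    lower, upper = binomial_confidence_limits(p, n)
    print(p, n, round(lower, 2), round(upper, 2))
```

Running the loop shows the interval narrowing from 0.74-0.93 at n = 50 to 0.82-0.89 at n = 300 for the same observed proportion.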


Assessing Model Goodness of Fit


Parker (1985) presents an alternative approach for assessing archaeological
model performance that does not focus on percent correct statistics but compares
observed with predicted probabilities of site presence. In this approach, which
yields a graphical result, a probability scale (i.e., a scale ranging from 0 to 1) of site
presence is divided arbitrarily into multiple groups or intervals (e.g., 0 ≤ p ≤ 0.02;
0.02 ≤ p ≤ 0.06; 0.06 ≤ p ≤ 0.10, etc.; Parker 1985:192). Using predicted site
probability values estimated at sample locations by a logistic regression model, the
number of known sites and the number of known nonsites that fall in each interval is
determined, and the proportion of the total number of locations that are sites is
calculated. This proportion is taken as an estimate of the observed probability of site
presence in each interval. Expected probabilities for each interval are calculated
simply as the group midpoint value (e.g., the midpoint of the interval 0.02 ≤ p ≤
0.06 is 0.04). The observed and expected pairs for each interval are then plotted on a
graph that can be used to assess model goodness of fit. If the plotted points follow a
line with an intercept of 0 and a slope of 1 (a 45° angle), the model offers a good fit
(Parker 1985:190-192).


A problem with Parker's method is that it is largely subjective; goodness of fit
must be determined through a visual assessment of how well the observed and
expected values follow a straight line. There is no associated significance test.
Moreover, the specific group intervals used in Parker's application were of varying
width and were apparently formed during analysis to maximize agreement between
observed and expected values. This tactic may have been necessary, however,
owing to the extremely small sample size (30 sites) under investigation.
Medical researchers have developed a remarkably similar approach for assessing
the goodness of fit of predictive models based on logistic regression. In this
approach, though, the grouped probability intervals are specified prior to the
analysis, allowing more objective results, and an associated significance test is
available. Of primary importance for archaeological purposes is that the test may
appropriately be applied to the data set from which a model was derived, forming an
important tool for screening out useless models prior to further testing. This
approach to goodness-of-fit assessment also utilizes the probability of site presence
estimated for a location by a statistical model, p. Intervals of equal width are formed,
e.g., 0-0.1, 0.1-0.2, ..., 0.9-1.0, and locations (cases) are assigned to the intervals on
the basis of p. If the model has predictive utility, then the p for locations with sites
should fall into the upper intervals. The observed number of locations with sites (o_{1k})
is compared with an expected number of locations with sites (e_{1k}) for each interval.
The latter is usually calculated as the sum of the estimated p-values for all locations
in a particular interval. More explicitly,


$$o_{1k} = \sum_{i \in k} y_i$$

$$e_{1k} = \sum_{i \in k} p_i$$


where $k = 1, \ldots, g$ intervals; $o_{1k}$ is the observed number of sites in the $k$th interval, $e_{1k}$ is
the expected number of sites in the $k$th interval, $i \in k$ denotes that the $i$th case is a
member of the $k$th interval, and $y_i$ is coded 1 for sites and 0 for nonsites. In other
words, for a particular $k$th interval (e.g., 0.8-0.9), $o_{1k}$ represents the observed count of
sites having site-class p-values that fall in that interval; $e_{1k}$ is simply the sum of the
site-class p-values for all locations, site and nonsite, that fall in that interval
(Lemeshow and Hosmer 1982). As in Parker's (1985) method, the $g$ pairs of
observed and expected values may be plotted, allowing a subjective assessment of
goodness of fit when compared with a line with an intercept of 0 and a slope of 1
(Brand et al. 1976).
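
The grouping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `grouped_site_counts` and the eight probabilities and site codes are hypothetical and do not come from the Glade Park data. Each case is assigned to one of g equal-width intervals by its estimated probability; the observed site count is the sum of the 0/1 site codes in the interval, and the expected count is the sum of the estimated probabilities.

```python
from typing import List, Sequence, Tuple


def grouped_site_counts(
    p: Sequence[float], y: Sequence[int], g: int = 10
) -> List[Tuple[int, float]]:
    """Return one (observed, expected) pair of site counts per interval.

    p holds the estimated site probabilities, y the codes (1 = site,
    0 = nonsite), and g is the number of equal-width intervals on [0, 1].
    """
    observed = [0] * g
    expected = [0.0] * g
    for p_i, y_i in zip(p, y):
        k = min(int(p_i * g), g - 1)  # a probability of exactly 1.0 joins the last interval
        observed[k] += y_i            # sum of y_i over cases in interval k
        expected[k] += p_i            # sum of p_i over cases in interval k
    return list(zip(observed, expected))


# Hypothetical data for illustration only.
p_hat = [0.05, 0.12, 0.18, 0.55, 0.58, 0.91, 0.95, 0.97]
site = [0, 0, 1, 1, 0, 1, 1, 1]
table = grouped_site_counts(p_hat, site, g=10)

for k, (obs, exp) in enumerate(table):
    print(f"interval {k / 10:.1f}-{(k + 1) / 10:.1f}: observed = {obs}, expected = {exp:.2f}")
```

Plotting the observed counts against the expected counts from such a table gives the visual check described in the text.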


The comparison of observed and expected site frequencies has been developed
into a statistical test for goodness of fit by Lemeshow and Hosmer (1982). Since
considerable information is lost when only the site group is considered, a more
powerful testing procedure is made possible by considering observed and expected
frequencies for site and nonsite classes simultaneously. Observed and expected
frequencies for the nonsite group are calculated as follows:


$$o_{0k} = \sum_{i \in k} (1 - y_i)$$

$$e_{0k} = \sum_{i \in k} (1 - p_i)$$


where $o_{0k}$ is the observed number of nonsites in the $k$th interval and $e_{0k}$ is the
expected number of nonsites in the $k$th interval. The statistic described by
Lemeshow and Hosmer (1982:97) is

$$H_g = \sum_{k=1}^{g} \left[ \frac{(o_{1k} - e_{1k})^2}{e_{1k}} + \frac{(o_{0k} - e_{0k})^2}{e_{0k}} \right]$$

where the summation is from $k = 1, \ldots, g$ intervals.


The site location model used to illustrate this test for goodness of fit is the 
initial nine-variable Glade Park logistic regression model given in an earlier section 
of this chapter. Results of applying this model to the same data that were used to 
construct the model are tabulated in Table 8.2. A requirement of the test is that the
number of intervals (g) should be greater than j+1, where j is the number of 
predictor variables used by the model (Lemeshow and Hosmer 1982:96). In the 
present case, j + 1 = 10; hence, 12 intervals, with constant widths of 0.0833, are used. 
The observed and expected frequencies tabulated for the site and nonsite groups in 
Table 8.8c are used to calculate 


$$
\begin{aligned}
H_g ={}& (0-0.61)^2/0.61 + (4-2.32)^2/2.32 + (5-3.90)^2/3.90 + (8-8.54)^2/8.54 \\
&+ (14-11.35)^2/11.35 + (16-16.58)^2/16.58 + (16-19.08)^2/19.08 \\
&+ (23-24.46)^2/24.46 + (24-25.36)^2/25.36 + (19-19.58)^2/19.58 \\
&+ (19-16.62)^2/16.62 + (9-8.60)^2/8.60 + (18-17.39)^2/17.39 \\
&+ (15-16.68)^2/16.68 + (14-15.10)^2/15.10 + (21-20.47)^2/20.47 \\
&+ (16-18.65)^2/18.65 + (20-19.42)^2/19.42 + (19-15.92)^2/15.92 \\
&+ (16-14.54)^2/14.54 + (12-10.64)^2/10.64 + (6-5.42)^2/5.42 \\
&+ (0-2.38)^2/2.38 + (0-0.40)^2/0.40 \\
={}& 8.28
\end{aligned}
$$


The distribution of this statistic is approximated by a chi-square distribution with
$g - 2 = 10$ degrees of freedom. At a level of significance of $\alpha = 0.05$ the null hypothesis
of a good fit can be rejected if $H_g$ exceeds 18.31. Since $H_g$ is smaller than that value,
we can accept the null hypothesis. In fact, the null hypothesis could be accepted at
$\alpha = 0.5$.
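
The arithmetic above can be checked with a short Python sketch. The observed and expected counts are transcribed from the worked calculation in the text; splitting the 24 terms into a site list (first 12) and a nonsite list (last 12) is an assumption inferred from the fact that each half sums to 157 observations, and the function name `goodness_of_fit_h` is hypothetical. The critical value of 18.31 is the one quoted in the text rather than one computed here.

```python
# Counts transcribed from the worked calculation in the text.
# Assumption: the first 12 terms belong to the site group and the last 12
# to the nonsite group, one term per probability interval.
SITE_OBS = [0, 4, 5, 8, 14, 16, 16, 23, 24, 19, 19, 9]
SITE_EXP = [0.61, 2.32, 3.90, 8.54, 11.35, 16.58, 19.08, 24.46, 25.36, 19.58, 16.62, 8.60]
NONSITE_OBS = [18, 15, 14, 21, 16, 20, 19, 16, 12, 6, 0, 0]
NONSITE_EXP = [17.39, 16.68, 15.10, 20.47, 18.65, 19.42, 15.92, 14.54, 10.64, 5.42, 2.38, 0.40]


def goodness_of_fit_h(obs_site, exp_site, obs_nonsite, exp_nonsite):
    """Sum (observed - expected)^2 / expected over both groups and all intervals."""
    pairs = list(zip(obs_site, exp_site)) + list(zip(obs_nonsite, exp_nonsite))
    return sum((o - e) ** 2 / e for o, e in pairs)


h_g = goodness_of_fit_h(SITE_OBS, SITE_EXP, NONSITE_OBS, NONSITE_EXP)

g = len(SITE_OBS)                # 12 intervals
degrees_of_freedom = g - 2       # 10, as stated in the text
CRITICAL_VALUE = 18.31           # chi-square value quoted in the text for alpha = 0.05

fit_not_rejected = h_g < CRITICAL_VALUE
print(f"H_g = {h_g:.2f} on {degrees_of_freedom} degrees of freedom; "
      f"good fit not rejected: {fit_not_rejected}")
```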


A similar goodness-of-fit test is presented by Costanzo et al. (1982). This test 
focuses on residuals rather than predicted probabilities. 


Sequential Methods 


An approach to model testing that potentially requires smaller test samples 
was presented in an archaeological study by Limp and Lafferty (1981:226-229). The 
approach utilizes a sequential probability ratio test or SPRT (Dixon and Massey 
1957:304-310; Wetherill 1975). The SPRT requires the collection of new sample data,
but only until a decision about a model’s performance can be reached. That is, the 
sequential method does not require the collection of more observations than are 
necessary to make a decision. This approach can be beneficial for model testing 
since it offers the potential for reduced amounts of additional survey and, therefore, 
lower costs. 

The SPRT allows a decision between two simple hypotheses. Suppose there is
interest in the parameter $\theta$, the true site density in a low site probability stratum
established by an archaeological model. We wish to test the null hypothesis that the
true site density equals some specified level, $\theta = \theta_0$, against the alternative
hypothesis that the true site density equals some other specified level, $\theta = \theta_1$. The
SPRT decides in favor of either $\theta_0$ or $\theta_1$ on the basis of sample observations. If $\theta_0$ is
true, we would like to decide in its favor with a probability of $1-\alpha$ or greater; if $\theta_1$ is
true, we would like to decide for $\theta_1$ with a probability of $1-\beta$ or greater.


To illustrate, a predictive archaeological model developed in southern Arkansas
yielded a low site probability stratum that was mapped throughout the entire
region of study (Limp and Lafferty 1981). The unit of analysis was a 4 ha grid unit
(i.e., a land parcel 200 m on a side); the entire region was gridded into more than 3000
such units. Based on the sample data used to establish the model it was estimated
that in the low probability stratum the proportion of all grid units with sites was
only 0.009. Limp and Lafferty (1981:227) were willing to accept the model if the true
proportion ($\theta$) of units with sites in the low probability stratum really was 0.009 or
less. They therefore established an SPRT to test with independent data the null
hypothesis that the true site proportion is $\theta_0 = 0.009$ (setting the probability of
falsely rejecting the null hypothesis at $\alpha = 0.10$). Their alternative hypothesis was
that the true proportion of units with sites was $\theta_1 = 0.025$, an arbitrary proportion that
they deemed would yield an unacceptably high number of site-present grid units in
the low probability stratum. (They set the probability of falsely accepting the null
hypothesis, i.e., accepting $\theta_0$ when $\theta_1$ is really true, at $\beta = 0.10$.) Thus, their
sequential test was established in order to decide whether to accept $\theta_0 = 0.009$ (or
less) or an alternative, $\theta_1 = 0.025$ (or greater), as the true site proportion.


The SPRT requires that observations (grid units) be made, by random selection,
one at a time. After each observation, one of three decisions is made: (a)
accept the null hypothesis ($\theta_0 = 0.009$), (b) reject the null hypothesis by accepting
the alternate hypothesis ($\theta_1 = 0.025$), or (c) make an additional observation. The
test offers an easy-to-use graphic counterpart established by the following formulas.
An upper limit is given by

$$G_s \ln(\theta_1/\theta_0) + G_n \ln[(1-\theta_1)/(1-\theta_0)] = \ln[(1-\beta)/\alpha]$$

and a lower limit by

$$G_s \ln(\theta_1/\theta_0) + G_n \ln[(1-\theta_1)/(1-\theta_0)] = \ln[\beta/(1-\alpha)]$$


where $G_s$ is the number of grid units currently inspected with sites and $G_n$ is the
number of grid units with no sites. Inserting values defined above yields


$$G_s \ln(0.025/0.009) + G_n \ln[(1-0.025)/(1-0.009)] = \ln[(1-0.10)/0.10]$$

and

$$G_s \ln(0.025/0.009) + G_n \ln[(1-0.025)/(1-0.009)] = \ln[0.10/(1-0.10)]$$

yielding, after simplification, the following respective upper and lower limit equations:

$$1.0217\,G_s - 0.0163\,G_n = 2.1972$$

and

$$1.0217\,G_s - 0.0163\,G_n = -2.1972$$


These limits can be plotted as parallel lines in a graph by finding two points for each
and drawing a line through them. Setting the number of units without sites, $G_n$, to 0
gives $G_s = 2.15$ units with sites for the upper limit and $G_s = -2.15$ for the lower limit.
Setting $G_n = 200$ gives $G_s = 5.34$ for the upper limit and $G_s = 1.04$ for the lower limit.
The upper and lower limit lines plotted through these points are graphed in Figure 8.13.


In a model testing context, graphs such as Figure 8.13 are established prior to 
testing. During the survey, the result for each test observation (grid unit) is plotted 
by drawing a line one unit to the right if the observation does not contain a site and 
one unit upward if the observation contains a site. Sampling is continued until the 
plotted line crosses the upper or lower limit, at which time a decision is reached
concerning the acceptance or rejection of the model. If the true proportion of units
with sites is exactly equal to $\theta_0$, then the null hypothesis will be accepted
approximately $100(1-\alpha)$ percent of the time (upon repeated testing trials); if the
true proportion of units with sites is exactly equal to $\theta_1$, then the null hypothesis
will be accepted about $100\beta$ percent of the time; if the true proportion of units
with sites is between $\theta_0$ and $\theta_1$, then the null hypothesis will be accepted between
$100(1-\alpha)$ and $100\beta$ percent of the time, the percentage of acceptance decreasing
progressively from $100(1-\alpha)$ to $100\beta$ as the true proportion increases from $\theta_0$ to
$\theta_1$.





[Figure 8.13 graph not reproduced: upper and lower limit lines with regions labeled "Reject Model" and "Continue to Sample"; axis ticks at 0, 100, and 200.]

Figure 8.13. Sequential sampling design for a southern Arkansas study. For a given number of units
surveyed, if the number of sites encountered exceeds the upper limit, the site density expected by a predictive
model is exceeded and the model may be rejected (after Limp and Lafferty 1981:227).

In Limp and Lafferty’s (1981) application (Figure 8.13), it is readily apparent
that if the first 135 grid units sampled did not contain sites, the model could be
accepted immediately (i.e., $\theta = \theta_0 = 0.009$). On the other hand, discovery of only
three units with sites for sample sizes under 53 would be cause for immediate
rejection of the model and acceptance of $\theta = \theta_1 = 0.025$. The space between the
acceptance and rejection regions represents an “inconclusive” range where neither
decision can be reached.
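
The decision rule described above can be sketched in Python as follows. This is a minimal illustration of the sequential limits for site-present and site-absent grid units, not code from Limp and Lafferty; the function names `sprt_limits` and `sprt_decision` and the wording of the returned labels are assumptions made here. The defaults are the values used in the text (a null proportion of 0.009, an alternative of 0.025, and both error probabilities set at 0.10).

```python
import math


def sprt_limits(theta0, theta1, alpha, beta):
    """Return the two weights and the upper and lower decision limits."""
    weight_site = math.log(theta1 / theta0)               # added for each unit with a site
    weight_empty = math.log((1 - theta1) / (1 - theta0))  # added for each unit without a site
    upper = math.log((1 - beta) / alpha)                  # reaching this limit rejects the model
    lower = math.log(beta / (1 - alpha))                  # reaching this limit accepts the model
    return weight_site, weight_empty, upper, lower


def sprt_decision(units_with_sites, units_without_sites,
                  theta0=0.009, theta1=0.025, alpha=0.10, beta=0.10):
    """Classify the current tally as accept, reject, or continue sampling."""
    weight_site, weight_empty, upper, lower = sprt_limits(theta0, theta1, alpha, beta)
    score = units_with_sites * weight_site + units_without_sites * weight_empty
    if score >= upper:
        return "reject model"
    if score <= lower:
        return "accept model"
    return "continue sampling"


weight_site, weight_empty, upper, lower = sprt_limits(0.009, 0.025, 0.10, 0.10)
print(f"{weight_site:.4f} G_s {weight_empty:+.4f} G_n = {upper:.4f} (upper limit)")
print(f"{weight_site:.4f} G_s {weight_empty:+.4f} G_n = {lower:.4f} (lower limit)")
print(sprt_decision(0, 135))   # 135 units inspected, none with sites
print(sprt_decision(3, 53))    # three units with sites early in the survey
```

In practice the function would be called after each newly surveyed unit, with sampling stopped as soon as it returns something other than "continue sampling".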


Formulas for estimating the average sample size needed to arrive at a decision
are given by Dixon and Massey (1957:309-310). Application of these formulas to the
Limp and Lafferty data yields $n = 253$ when the true proportion of sites is $\theta_0 = 0.009$;
$n = 183$ when the true proportion is $\theta_1 = 0.025$; and $n = 290$ when the true proportion
is between $\theta_0$ and $\theta_1$ (this latter figure is approximately the maximum average
sample size).
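
As a rough check on these figures, the sketch below uses the standard Wald approximations for the average sample size of a sequential probability ratio test on a proportion. Whether these are exactly the formulas printed by Dixon and Massey is an assumption; the results agree with the values quoted in the text to within one or two units, which is consistent with small differences in rounding. The function name `average_sample_sizes` is hypothetical.

```python
import math


def average_sample_sizes(theta0, theta1, alpha, beta):
    """Approximate average sample sizes at theta0, at theta1, and at the worst case between."""
    upper = math.log((1 - beta) / alpha)
    lower = math.log(beta / (1 - alpha))
    weight_site = math.log(theta1 / theta0)
    weight_empty = math.log((1 - theta1) / (1 - theta0))

    def mean_step(theta):
        # Expected change in the test score per inspected unit when the true proportion is theta.
        return theta * weight_site + (1 - theta) * weight_empty

    # The null hypothesis is accepted with probability 1 - alpha when theta0 is true.
    n_theta0 = ((1 - alpha) * lower + alpha * upper) / mean_step(theta0)
    # The null hypothesis is accepted with probability beta when theta1 is true.
    n_theta1 = (beta * lower + (1 - beta) * upper) / mean_step(theta1)
    # Intermediate proportion at which the expected step is zero (approximate worst case).
    theta_mid = -weight_empty / (weight_site - weight_empty)
    step_variance = theta_mid * weight_site ** 2 + (1 - theta_mid) * weight_empty ** 2
    n_max = -upper * lower / step_variance
    return n_theta0, n_theta1, n_max


n_theta0, n_theta1, n_max = average_sample_sizes(0.009, 0.025, 0.10, 0.10)
print(f"theta0 true: about {n_theta0:.0f} units")
print(f"theta1 true: about {n_theta1:.0f} units")
print(f"worst case between them: about {n_max:.0f} units")
```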


It should be emphasized that the Limp and Lafferty (1981) example illustrates a
rather extreme application of sequential methods because they focused on such low
probabilities (i.e., $\theta_0 = 0.009$). In any statistical procedure dealing with rates and
proportions, a number of problems arise when estimated probabilities are very high
or very low. First, since the estimated probabilities are based on relative frequencies
derived from empirical data, very large samples are needed for reliable estimates
when the relative frequencies are extreme (e.g., less than 0.05 or greater than
0.95). Limp and Lafferty (1981) derived $\theta_0 = 0.009$ by finding two sites in only 235
units in their initial sample. A change of only two sites in either direction would
have caused $\theta_0$ to range between 0 and 0.017, substantially altering the structure of
the sequential test given above or even preventing its use (in the $\theta_0 = 0$ case). A
sample size of several thousand would be needed for a reliable estimate of $\theta_0 = 0.009$.
Second, extreme estimated probabilities in a sequential test require that large
samples be examined before a decision regarding acceptance or rejection of $\theta_0$ can
be made. To illustrate, if a more reasonable low probability stratum that contained
approximately 20 percent of all sites had been defined, then $\theta_0 = 0.2$. Suppose that a
determination had been made that this stratum could acceptably contain as many as
30 percent of all sites; then $\theta_1 = 0.3$. The average sample size needed to arrive at a
decision (leaving $\alpha = \beta = 0.1$) would be $n = 69$ when the true proportion is $\theta_0 = 0.2$; $n = 63$
when the true proportion is $\theta_1 = 0.3$; and $n = 90$ when the true proportion is
between $\theta_0$ and $\theta_1$ (compare $n = 253$, $n = 183$, and $n = 290$, respectively, for the Limp
and Lafferty application above).


Several important assumptions and technical difficulties behind the sequential 
method limit its practical use. Sequential methods assume complete randomization 
of sampling units. After each unit is inspected a new decision is made; therefore, the 
next unit must be chosen at random. This prohibits the typical practice of selecting 
clusters of units located near one another for each day’s work in order to minimize 
travel. Each unit must be selected at random, and the units must be inspected in 
random order. This requirement necessarily causes increased effort to be expended 
in travel to sampling units. This difficulty may be reduced to some extent by 
selecting sampling units in groups (e.g., groups of 10) rather than individually; this
would allow some flexibility in travel plans. The sequential test would then be
assessed after surveys of each group had been completed. The principal effect on 
the procedure would be to increase the average sample size needed to arrive at a 
decision by an amount equal to the size of each group (Dixon and Massey 1957:310). 


Base Rate Probabilities 


Previous sections have presented a number of procedures for assessing the
performance of a model through independent samples and significance tests. Before
we can fully assess a particular model or understand how well it will work in
practice, we must take into account one final domain: the base rate or a priori site
and nonsite probabilities, which have been mentioned several times in previous
sections. By using these probabilities one can make estimates of the probability of
site class membership within a region mapped by a model or, alternatively, estimate
the probability of site class membership at specific loci within a region of study.

Archaeological sites are rare phenomena. This can be clearly demonstrated by
examining the a priori probability of site occurrence within a region, that is, the purely
chance probability of site presence considering no other information. This probability
is usually extremely low, ranging in the vicinity of 1 to 5 percent or even much
less. This probability can be estimated as


$$P(\text{site}) = P(S) = \frac{\text{total area covered by known sites}}{\text{total area surveyed}}$$


The total area covered by known sites is most accurately estimated by measuring 
site area in the field or by determining the area of the dots and polygons usually 
used to record site locations on maps. If a small grid (e.g., one of 50 by 50 m cells) is
superimposed over the study region and the number of grid cells that contain 
cultural remains are counted, then P(S) can be estimated simply by dividing the 
total number of cells with sites by the total number of field-inspected cells. Reliable 
estimation of P(S) always requires fairly large samples. It is important to note that 
the gridding method can cause an overestimate of P(S) when a large grid size is used. 
A large cell is more likely to contain a site than a smaller one, and this causes the 
relative number of cells with sites to increase while the total number of cells is 
decreased. 


The Glade Park data can be used once again to reveal that 157 of the 2432
surveyed analysis units (each measuring 1 ha) contain sites, yielding an estimate of
P(S) = 157/2432 = 0.065. Most of the sites discovered, however, were very small lithic
scatters covering an area much smaller than a hectare, which suggests that the
above figure is an overestimate. Examination of the site records indicates that the
157 sites occupy a total area estimated at about 538,000 m², or an average size of less
than 3500 m² (compared to 10,000 m² in a hectare). Since 38 quarter-sections occupy
approximately 25 million square meters, a better estimate of the actual base rate
probability of archaeological site presence in Glade Park might be P(S) =
538,000/25,000,000 = 0.021. Since Glade Park is one of the archaeologically richest
areas in Colorado, this figure is relatively high.
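
The two ways of estimating the base rate can be compared with a small Python sketch. The function names are hypothetical, and the inputs are the rounded Glade Park figures quoted in the text, so the area-based result is only approximate.

```python
def base_rate_from_cells(cells_with_sites: int, cells_inspected: int) -> float:
    """Grid-based estimate: share of inspected cells that contain a site."""
    return cells_with_sites / cells_inspected


def base_rate_from_area(site_area_m2: float, surveyed_area_m2: float) -> float:
    """Area-based estimate: share of the surveyed area covered by sites."""
    return site_area_m2 / surveyed_area_m2


grid_estimate = base_rate_from_cells(157, 2432)            # 1 ha analysis units
area_estimate = base_rate_from_area(538_000, 25_000_000)   # rounded areas from the text

print(f"grid-based P(S) is about {grid_estimate:.3f}")
print(f"area-based P(S) is about {area_estimate:.3f}")
print(f"the grid estimate is about {grid_estimate / area_estimate:.1f} times larger")
```

The gap between the two numbers illustrates the point made above: when most sites are much smaller than a grid cell, counting cells overstates the base rate.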

Incorporation of prior probabilities into a classification model will decrease the
overall rate of misclassification (Morrison 1976:235), but when the prior probability of
one group is extremely low (as in the archaeological case) the error rate for the
low-probability class is increased substantially by this procedure (Morrison
1969:160; Overall and Klett 1972:263). The extreme magnitude of the prior probabilities
in such cases “overpowers” the estimated probabilities that are conditional on
environmental and other data, with the effect that the final model essentially
utilizes only the prior information in classifying observations. It is best, therefore,
not to include prior probabilities in model development but to reserve them for
model performance assessment (see below, however, for a discussion of the use of
prior probabilities in estimates of probabilities at specific loci). Some disciplines
actually manufacture a priori probabilities, arbitrarily setting P(S) = 0.9, for example,
in an effort to increase the chance that a rare group of interest will be correctly
identified by a predictive classification model (Schowengerdt 1983:43). This procedure
is mathematically equivalent to the cutoff point adjustment approach
explained in an earlier section.


Estimating Site Probabilities in Regions 


Since archaeological sites are a valuable resource, it is more important for 
archaeological locational models to classify site-present locations correctly than for 
models to classify site-absent locations correctly. We would like, therefore, to 
produce models that classify a major proportion of sites correctly, say 90 percent. 
This can be accomplished using the method of modified cutoff points described 
above and the nonsite data can be used to indicate the approximate percentages of 
the study area within which a specified percentage of sites should occur. But in 
order to determine other dimensions of model performance, such as the site 
densities that can be expected, we can use prior probabilities with the model 
performance indications obtained through the cutoff point adjustment approach 
and Bayes’s Theorem (Hays 1981:39-41). More specifically, given an area of a region 
mapped by a model as site-likely or site-favorable, the following procedures yield an 
estimate of the probability of site class membership within that modeled region and 
an estimate of the probability of site class membership outside the modeled region. 


To illustrate this procedure the percent correct statistics yielded by applying
the jackknifed Glade Park model to the independent test data (Table 8.12b) are
used. These data indicate (at a model cutoff point of p = 0.4) that approximately 86
percent of the sites should be classified correctly (Table 8.8b). Let $S$ be the event
that a site is actually present, and let $M$ be the event that the model indicates that a
site is present. We want to find the conditional probability, P(S|M), of site class
membership given that the model suggests site presence. If we use the grid-based
analysis, the a priori probability of site presence at a location is estimated as P(S) =
0.065 (see above); then $P(S^c) = 0.935$, where $S^c$ indicates the complement of site
presence, i.e., site absence. The probability that the model will indicate a site given
that a site is actually present is P(M|S) = 43/50 = 0.86, and the probability that the
model will indicate a site given that a site is not present is $P(M|S^c) = 50/87 = 0.575$ (data
from Table 8.8b). According to Bayes’s Theorem,


$$P(S \mid M) = \frac{P(M \mid S)P(S)}{P(M \mid S)P(S) + P(M \mid S^c)P(S^c)} = \frac{(0.86)(0.065)}{(0.86)(0.065) + (0.575)(0.935)} = 0.094$$








Consequently, in the portion of the study region that this model would map as
site-likely (at the p = 0.4 cutoff), the probability of site class membership at any
location (hectare cell) within the region is P(S|M) = 0.094, which is 0.094/0.065 or 1.45
times better than a purely chance model (P[S] = 0.065). On the other hand, the
probability of site class membership given that the model does not indicate a site is
roughly


$$P(S \mid M^c) = \frac{P(M^c \mid S)P(S)}{P(M^c \mid S)P(S) + P(M^c \mid S^c)P(S^c)} = \frac{(0.14)(0.065)}{(0.14)(0.065) + (0.425)(0.935)} = 0.022$$





In the portion of the environment not mapped by the model as site-likely the
probability of site class membership is only $P(S|M^c) = 0.022$. This suggests that
haphazardly throwing darts at a map of the region (a purely chance model) might be
three times (0.065/0.022) more likely to indicate a site than the model is
in this subarea. Moreover, a site is more than 4.2 times (0.094/0.022) as likely
to occur in the mapped site-likely region as in the site-unlikely region. (It is
emphasized, once again, that these procedures can be extended to problems
involving multiple site classes.)


The meaning of these statistics is made clearer by imagining that the Glade
Park model (at the p = 0.4 cutoff) is mapped over the entire study region (roughly
160,000 ha), much like the mappings in Figure 8.8. About 6.5 percent of these
hectare-unit locations (P[S] = 0.065), or 10,400 of them, will contain sites, and about
93.5 percent ($P[S^c] = 0.935$) or 149,600 will not (Figure 8.14), as estimated by their
base rate chances of occurrence. Of the 10,400 locations that contain sites, the
predictive site location model (at the p = 0.4 cutoff point) will (as indicated by the
independent tests) correctly classify about 86 percent (P[M|S] = 0.86) or 8944 as
containing sites and will incorrectly classify about 14 percent ($P[M^c|S] = 0.14$) or 1456
as belonging to the site-absent category. Of the 149,600 nonsite locations, 42.5
percent ($P[M^c|S^c] = 0.425$) or 63,580 will be correctly identified and about 57.5
percent ($P[M|S^c] = 0.575$) or 86,020 will be classified as sites. Thus, although the
model assigns 8944 + 86,020 = 94,964 locations as site-likely, only 8944 of these
actually contain sites, or 8944/94,964 = 0.094 = P(S|M), a roundabout, and hopefully
more understandable, presentation of Bayes’s Theorem. These calculations are
illustrated in Figure 8.14.
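
Both routes to the same answer, Bayes’s Theorem and the counting argument, can be reproduced with the Python sketch below. The function name `posterior_site_probability` and the variable names are hypothetical; the inputs are the base rate, the two classification rates, and the approximate 160,000 ha study region quoted in the text.

```python
def posterior_site_probability(prior_site, p_signal_given_site, p_signal_given_nonsite):
    """Bayes's Theorem for two classes: P(site | signal)."""
    numerator = p_signal_given_site * prior_site
    return numerator / (numerator + p_signal_given_nonsite * (1 - prior_site))


P_SITE = 0.065            # base rate from the grid-based estimate
P_M_GIVEN_SITE = 0.86     # sites correctly flagged by the model
P_M_GIVEN_NONSITE = 0.575 # nonsites flagged as site-likely

# Probability of a site inside and outside the mapped site-likely zone.
p_site_in_zone = posterior_site_probability(P_SITE, P_M_GIVEN_SITE, P_M_GIVEN_NONSITE)
p_site_outside = posterior_site_probability(P_SITE, 1 - P_M_GIVEN_SITE, 1 - P_M_GIVEN_NONSITE)

# The same result by counting hectare cells in a region of about 160,000 ha.
total_cells = 160_000
site_cells = round(total_cells * P_SITE)
nonsite_cells = total_cells - site_cells
sites_flagged = round(site_cells * P_M_GIVEN_SITE)
sites_missed = site_cells - sites_flagged
nonsites_flagged = round(nonsite_cells * P_M_GIVEN_NONSITE)
nonsites_cleared = nonsite_cells - nonsites_flagged
flagged_total = sites_flagged + nonsites_flagged

print(f"P(S|M)   = {p_site_in_zone:.3f}")
print(f"P(S|M^c) = {p_site_outside:.3f}")
print(f"site-likely cells: {flagged_total}, of which {sites_flagged} contain sites")
print(f"share by counting: {sites_flagged / flagged_total:.3f}")
```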


It is important to recognize that the predicted 94,964 site-likely hectares of the
model can potentially be mapped through use of computer mapping techniques
(see above and Chapter 10). About 86 percent of all sites would occur within the
approximately 57 percent of the total land area that is mapped by the model as
having high site sensitivity. The area outside the mapping would form a
low-sensitivity zone covering about 43 percent of the land area and would contain only
14 percent of all sites. In fact, 100(63,580)/(63,580 + 1456) = 97.8 percent of the
locations in the low-sensitivity zone would not contain sites. About one location in
every 10 would contain a site in the high-sensitivity zone, but only one location in
every 45 would contain a site in the low-sensitivity region. These statistics, of
course, are based on the model using the p = 0.4 cutoff and on accuracy rates
obtained from one sample (Table 8.8b). Performance indications such as these will
vary depending on the cutoff point and accuracy estimates used.


Estimating Site Probabilities at Specific Loci


Cultural resource managers often wish to estimate the probability of archaeological
site class membership given the data measured at a particular location, such
as a single hectare grid cell, rather than simply estimating the probability of a site
within a larger region, such as a high-sensitivity zone as a whole. Probability
estimates for specific loci also require use of the a priori probabilities P(S) and $P(S^c)$.
These probabilities are used in conjunction with modifications of the formulas for
estimating site probabilities conditional on environmental and other measurements
(given in the section “Application Comparison of Quantitative Locational Models”
above).


It was demonstrated with empirical test evidence that if the Glade Park 
jackknifed model (at the p = 0.4 cutoff) were to be mapped, the probability of site 
class membership within the mapped site-likely region would be about 0.095, and 
the probability of site class membership outside the mapped region would be about 
0.022. These estimates, one for the entire area mapped by the model and one for the 
rest of the study region, respectively, serve as a kind of “average” probability 
figure for these portions of the study area. In other words, if we know that a location 
falls somewhere within the region mapped by the model as site-likely, then we can 
say that the probability of site class membership is about 0.095. This technique
makes use only of the knowledge that a location is, or is not, in a modeled region as a
whole, mapped at some cutoff point; it does not consider any particular factors, such
as environmental characteristics at a particular location. 

It is also possible to estimate the probability of site class membership at a
specific locus (land parcel) by ignoring the mapping and considering the environmental
characteristics of the locus together with the base rate or chance probability
of a site (but for complete validity this procedure technically requires that all the
assumptions of the classification model used are met). The discriminant analysis
described in the application comparison section (above) provides an example. That
analysis yielded a discriminant score of $D_j = 2.2009$ for the environmental data
measured at the location of site 5LA5364 (Table 8.3). This location’s probability of
membership in the site class, conditional only on the environmental measurements
and assuming that the assumptions of the discriminant model were fully met, was
estimated as


$$p = \frac{e^{-0.5(D_j - \bar{D}_s)^2}}{e^{-0.5(D_j - \bar{D}_s)^2} + e^{-0.5(D_j - \bar{D}_{ns})^2}} = 0.873$$





(recall that $\bar{D}_s = 0.8304$ and $\bar{D}_{ns} = -0.1936$). Again, this probability is estimated from
the measurements only and does not consider the base rate proportions of sites and
nonsites in the area. A modification of this formula to incorporate prior probabilities,
P(S) and $P(S^c)$, yields


$$p = \frac{P(S)\, e^{-0.5(D_j - \bar{D}_s)^2}}{P(S)\, e^{-0.5(D_j - \bar{D}_s)^2} + P(S^c)\, e^{-0.5(D_j - \bar{D}_{ns})^2}}$$


and the estimated probability of site class membership at this location, incorporating
both environmental and base rate data, is approximately p = 0.323 (using P[S] =
0.065 and $P[S^c] = 0.935$). The lower figure results from the inclusion of the prior
information on site proportions and provides a more realistic estimate of anticipated
probabilities. Similar modifications of other formulas (e.g., logistic regression) can
be found in standard statistical texts.
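
The two probability estimates for this location can be reproduced with a short Python sketch. The function name `site_class_probability` is hypothetical, and the form of the calculation (an exponential term for each class, weighted by its prior) is inferred from the values reported in the text, which it reproduces. A prior of 0.5 weights both classes equally and therefore gives the estimate based on the measurements alone.

```python
import math


def site_class_probability(score, mean_site, mean_nonsite, prior_site=0.5):
    """Probability of site class membership from a discriminant score.

    With prior_site = 0.5 the priors cancel and the result depends only
    on the discriminant score and the two group mean scores.
    """
    weight_site = prior_site * math.exp(-0.5 * (score - mean_site) ** 2)
    weight_nonsite = (1 - prior_site) * math.exp(-0.5 * (score - mean_nonsite) ** 2)
    return weight_site / (weight_site + weight_nonsite)


SCORE = 2.2009          # discriminant score at the example location
MEAN_SITE = 0.8304      # mean discriminant score of the site group
MEAN_NONSITE = -0.1936  # mean discriminant score of the nonsite group

p_measurements_only = site_class_probability(SCORE, MEAN_SITE, MEAN_NONSITE)
p_with_base_rate = site_class_probability(SCORE, MEAN_SITE, MEAN_NONSITE, prior_site=0.065)

print(f"measurements only: p = {p_measurements_only:.3f}")
print(f"with base rate:    p = {p_with_base_rate:.3f}")
```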


MODEL REVISION 


Analyses described in the previous sections suggested that about 86 percent of
Glade Park sites might be predicted correctly by mapping a high site-sensitivity
zone that covers approximately 57 percent of the total Glade Park land area. This
particular result may not seem very impressive as an illustration of the power of
empirical site location models. It was noted earlier, however, that Glade Park
contains one of the highest site densities in Colorado. This fact, together with these
area performance indications, suggests that Glade Park was a very favorable place
for prehistoric peoples to perform activities and, in the process, create archaeological
sites. The 57 percent figure suggests that about 57 percent of the land area of
Glade Park contains environmental characteristics that are very similar to
characteristics exhibited by known sites (in terms of a partitioning of measurement space,
Figure 8.5b). This means that prehistoric peoples had a wide choice of settlement
locations within the region. Other regions and studies do not indicate such favorable
conditions for prehistoric inhabitants. The Colorado plains study described in
earlier sections (also see Kvamme 1984) found that sites were restricted primarily to
a narrow zone around major drainages. Statistics obtained through independent
testing suggested that about 90 percent of the sites might occur in only D percent of
the total land area of that study region. In a study in central Utah more than 90
percent of the sites were estimated to occur in about 15 percent of the study region’s
area (Reed and Chandler 1984:80).


It is through the use of nonsite control data that these area projections can be
made. Because many nonsite locations exhibit environmental characteristics identical
to those of sites (and thus fall on the site side of the decision boundary in the
measurement space), and because they are extremely prevalent, these approximate
area calculations can be made. Although much of the (nonsite) environment may
possess characteristics similar to those exhibited by known site locations, much of
the (nonsite) environment is very dissimilar, which allows the designation of
substantial portions of the environment as a low site-sensitivity zone. Thus, at
Glade Park 43 percent of the land area could be delineated as having low site
sensitivity, a result that would include only about 14 percent of the prehistoric sites
within that zone. At present, no method has been demonstrated that can discriminate
site-present from site-absent locations in the site-favorable portion of a
measurement space. In other words, given that there are many locations in the
environment that possess environmental and other characteristics identical to those
exhibited at site locations, there presently is no procedure that can differentiate
between sites and nonsites with identical environmental and other characteristics.


A projection like “90 percent of the sites will occur in 90 percent of the land
area” offers no gain. In assessing whether gain is sufficient, such factors as test
sample sizes and confidence interval widths should be considered. If it is deemed
that a model is inadequate, new variables that potentially offer better predictive 
power might be investigated or alternative samples might be examined and a new 
model developed. It also might be determined through testing or use that a site 
location model consistently misclassifies certain types of sites. In this case a model 
designed specifically for that site type might be considered. 


The method of sequential analysis (described above) is specifically designed to 
indicate the need for model revision in an ongoing research framework. When test 
sampling indicates that model-predicted site densities exceed or fall below specified 
limits, the model should be rejected and revised. Even for already tested models, 
ongoing testing through sequential methods might be conducted as future 
archaeological surveys are carried out and new information becomes available. 


The use of geographic information systems techniques (computer data bases 
encoded with environmental and other geographic information; see Chapter 10) 
might eventually lead to interactive model building, testing, and revision as an 
ongoing process. If environmental and other variables relevant to archaeological site 
location models are encoded in the data base along with known site locations, 
predictive models of many forms and varieties, such as models for multiple site 
types or temporal periods, could be generated instantaneously. As new sites and 
nonsites are discovered, additional model tests could be performed or these data 
could be incorporated into the data base to update existing models. As models 
change, so do the results of models. Computer graphic techniques can allow new 
maps of model results to be rapidly and cost-effectively produced so that the most 
current information can be used. 





Many of the results presented here stem from a history of personal involvement in archaeological 
predictive modeling that spans the past decade. This involvement owes its origins to a 1979 contract 
awarded by the Bureau of Land Management, Grand Junction District Office, to Nickens and 
Associates of Montrose, Colorado (where I was employed), for a regional predictive model. The Grand 
Junction District Office again supported my work in 1982-1983 for a larger study that resulted in 
considerably improved methods, including extensive survey and model testing, and a procedural 
manual. Doctoral work at the University of California at Santa Barbara exposed me to geographic 
information systems and remote sensing technology, the quantitative expertise of Albert C. Spaulding, 
and the hunter-gatherer settlement ideas of Michael A. Jochim, and it resulted in a dissertation on 
archaeological predictive modeling in 1983. The University of Denver's Pinon Canyon Archaeological 
Project, sponsored by the U.S. Army, called for extensive use and application of archaeological 
predictive models and GIS technology. My involvement with that project from 1983 to 1985 allowed 
further refinement and development of modeling methods with a very large data set, and production 
of GIS capabilities compatible with archaeological analysis and modeling needs. In recent years at the 
University of Arizona my teaching of a spatial analysis class, which focuses on GIS and archaeological 
modeling, has forced me to express the basic ideas and methods on an easy-to-understand level. Of 
more importance, however, are the many insights and applications of those technologies that my 
students have given me. Chapters 7, 8, and 10 of the volume owe much to the above persons and 
institutions. 


If there is any gauge of the success of one's work, it is in how much it is used. Although I have not 
received royalties as of yet, I am frequently sent copies of project reports that utilize (and in many cases 
copy directly) the methods summarized in Chapter 8. These studies, largely stemming from cultural 
resource management contexts and performed for various government agencies, represent a staggering 
amount of work (probably approaching 100 studies and projects). My hope is that the results of this 
work will be used responsibly by management personnel as tools to better care for, preserve, and 
protect cultural resources. If they are not used responsibly, then the fault lies with management (not 
the models) and we must focus our attention on defining responsibility. 


I wish to thank the following individuals in particular for their contributions to Chapters 7, 8, and 
10 of this volume. JoAnn Christein devoted considerable effort in manuscript production over several 
rewrites, in digitization of much of the data for the GIS work, and in giving moral support over this 
long and trying project. I am grateful to Mike Jochim for allowing me to present some of his Mesolithic 
data in these chapters. Finally, Dan Martin deserves special praise for his support of the whole volume 
and particularly for his continued encouragement of my work. 


Chapter 9 


REMOTE SENSING IN ARCHAEOLOGICAL 
PROJECTION AND PREDICTION 


During the preparatory stages of this volume the authors and editors were not 
certain that a chapter on the potential and use of remote sensing in archaeological 
predictive modeling would be entirely appropriate. The purposes of this book are to 
explore some of the complexities of predictive modeling, to examine some of the 
biases inherent in our present methods and data (see particularly Chapter 7), and to 
suggest directions that archaeological explanation will have to take in order to 
achieve successful and scientifically useful predictions (Chapter 4). There was some 
concern that the inclusion of a chapter on using remote sensing to do predictive 
modeling might imply that technical means now exist by which predictions can 
easily be made, that all one has to do is plug existing archaeological and remote- 
sensing-derived environmental data into a computer and a predictive model will 
emerge. 


It is clear from the preceding chapters in this volume that this is not the case. 
Predictive modeling is an area of great interest to archaeologists and managers alike, 
and perhaps more than any other fact, this interest indicates that we are just 
beginning to understand how to predict and model. One of the most universal 
cultural patterns is that people worry about and try to predict things in inverse 
proportion to how well they can really predict them. Nightly weather forecasts, for 
instance, dwell heavily on such questions as whether it will rain tomorrow, and the 
resultant predictions are of mixed success at best; there is never any discussion 
about whether the sun will come up in the morning. When we finally do perfect 
archaeological predictive modeling, there will probably be little discussion about it 
at meetings or in the literature. As discussed in Chapter 4, however, before we 
achieve success in prediction we will have had to learn many other things: how 
human systems are organized at several levels; how depositional and postdepositional 
processes affect the preservation and visibility of archaeological materials, and how 
this varies across the landscape; and how to make our methods of data discovery, 
collection, and analysis compatible with what we want to know about the past. In 
short, by the time we know how to do prediction we will also have discovered how 
to explain the archaeological record, and by the time we know how to predict, we 
may not need to do so anymore. 


At present, we have not achieved any of these goals completely. The preceding 
chapters of this volume point out that there is still a great deal of archaeological 
research to be done. There is not, in fact, agreement within the profession even 
about what predictive modeling means or about definitions of such important basic 
operational terms as site or system. This may seem unfortunate, but it need not be 
thought of as being so. Learning how to do predictive modeling and archaeology in 
general is a great adventure that we have just begun. 


This chapter will explore the possible role that remote sensing can play in that 
adventure and describe attempts by archaeologists to use remote sensing to project, 
predict, and explain the archaeological record, the operation of past behavior and 
behavioral systems, and the things that separate these two domains. It 
will begin with a review of the basics of remote sensing: what it is, the methods and 
techniques by which it is carried out, the data that it yields, and its capabilities and 
limitations for archaeological projection and prediction. Relevant literature and 
contemporary attempts at incorporating remote sensing in archaeological projection 
and prediction will be surveyed, and the strengths and weaknesses of these 
approaches discussed. Finally, some suggestions will be made about new, potentially 
productive applications of predictive remote sensing. 


FUNDAMENTALS OF REMOTE SENSING 


Platforms, Recording Devices, Data Types, and Analyses 


Remote sensing is the science and technology of obtaining information or data 
about physical objects and the environment through the process of recording, 
measuring, and interpreting photographic images and patterns of electromagnetic 
radiant energy (Ebert 1984:293). The most familiar remote sensing methods are 
photographic, and aerial and ground-based photography has been employed in 
archaeology since the beginnings of the discipline. The term remote sensing was 
coined in the late 1960s in response to the need for a term that could include both 
simple photographic data-collection techniques and the use of other, more exotic 
data sources, such as satellite and airborne multispectral scanners and microwave 
(radar) sensors, in a unified technical field. 


Remote sensing can best be understood when broken down into several of its 
component parts. Remote sensing platforms (the vantage points from which data 
are collected) range from the surface of the earth to low-altitude camera supports, 
such as bipods and tripods, to balloons, aircraft, and satellites hundreds of miles 
above the landscape. The devices with which remote sensor data are collected 
include active radar transmitters and receivers, proton magnetometers, cameras, 
and scanning devices recording reflected radiation. Remote sensor data can be 
recorded by these devices photochemically (i.e., with photographic emulsions) or 
electronically in either analog or digital formats, and in one or more wide or 
restricted wavelength bands. 


Remote sensor data can be analyzed by human interpreters who (a) simply 
look at visual products or (b) use magnifying and projecting devices to examine 
minute details of an image or (c) employ stereoscopes, which produce a three- 
dimensional image from partially overlapping photographic prints. Stereoscopic 
images can also be used to produce orthophotos (photographs in which errors of 
scale and orthographic errors have been removed) and photogrammetric maps, the 
most familiar of which are USGS topographic maps, virtually all of which are made 
from aerial photographs. 

The advent of computers capable of digital image processing has made available 
new and versatile forms of remote sensor data analysis based on mathematical 
manipulation of the matrix of picture elements (pixels) that constitute a digital 
image. Each pixel making up a digital image has a numerical value, which expresses 
the reflectance of the represented portion of the earth's surface. These values can 
be subjected to filtering, classification, histogram stretching (contrast enhancement), 
density slicing (density range simplification), power spectrum analysis, 
geometric correction, resampling, pattern recognition routines, and virtually any 
other matrix operation. While digital data are directly derived by most scanning 
devices, photographic and other analog data can be digitized into pixels for digital 
analysis, and conversely, digital data can be converted into visual images for 
photointerpretation. 
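
Two of the matrix operations named above, histogram stretching and density slicing, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any particular image-processing system: the image is a small hypothetical grid of reflectance values, the stretch is a simple linear min-max stretch (one of several possible stretches), and the slice break points are arbitrary.

```python
def linear_stretch(pixels, out_min=0, out_max=255):
    """Linearly rescale pixel values so they span out_min..out_max."""
    lo = min(min(row) for row in pixels)
    hi = max(max(row) for row in pixels)
    if hi == lo:
        # A flat image has no contrast to enhance.
        return [[out_min for _ in row] for row in pixels]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (v - lo) * scale) for v in row] for row in pixels]


def density_slice(pixels, breaks):
    """Collapse pixel values into a few classes.

    `breaks` is an ascending list of thresholds; a pixel's class is the
    number of thresholds it equals or exceeds.
    """
    return [[sum(v >= b for b in breaks) for v in row] for row in pixels]


# Hypothetical low-contrast 3 x 3 image (values crowded into 40..90).
image = [
    [40, 50, 60],
    [50, 70, 80],
    [60, 80, 90],
]

stretched = linear_stretch(image)
sliced = density_slice(image, breaks=[55, 75])

for row in stretched:
    print(row)
for row in sliced:
    print(row)
```

The stretch spreads the original 40 to 90 range over the full 0 to 255 display range, while the slice reduces the same values to three classes that could be mapped as zones.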

Clearly, remote sensing encompasses a great many methods and techniques; it 
is beyond the scope of this chapter to describe and explain each of them. The 
fundamentals and details of remote sensor platforms, data collection devices, data 
types, and data analysis devices and methods are covered exhaustively in many 
available sources to which the reader should refer for more complete information. 
One of the most comprehensive of these sources is the American Society of 
Photogrammetry’s Manual of Remote Sensing (Colwell 1983); one chapter in that 
volume (Ebert and Lyons 1983) focuses on archaeological, anthropological, and 
cultural resource remote sensing. Other excellent general remote sensing references 
are Avery (1977) and Lillesand and Kiefer (1979). A more concise summary of 
general archaeological applications of remote sensing can be found in Ebert (1984). 


Scales and Resolution 


Regardless of data source or type, there are two basic properties shared by all 
remote sensor data: scale and resolution. The scale of an image refers to the relationship 
between the size of the image and the actual size of the scene that the image 
represents. The scale of an image is determined by the distance between the data 
collection device and the scene being imaged and by the field of view of the data 
collection device. For aerial photographic data, for instance, the scale equals the 
focal length of the lens divided by flight height. The scale is generally expressed as a 
ratio (distance on the photograph:actual scene distance; Avery 1977:43). As an 
example, in a photograph with a scale of 1:12,000, 1 cm on the photographic image 
would represent a ground distance of 12,000 cm or 120 m. The scale of data recorded 
by such digital devices as multispectral scanners is determined similarly by distance 
(altitude) and the instantaneous field of view of the scanner (Lillesand and Kiefer 
1979:396). 
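
The scale arithmetic above can be written out as a short sketch. The relationship (scale equals focal length divided by flight height) and the 1:12,000 example come from the text; the 152 mm lens and 1,824 m flight height are hypothetical values chosen only because they yield that scale, and the function names are illustrative.

```python
def scale_denominator(focal_length_m: float, flight_height_m: float) -> float:
    """Return N in a photo scale of 1:N.

    Scale = focal length / flight height, so N = flight height / focal length.
    Both arguments must be in the same units (meters here).
    """
    return flight_height_m / focal_length_m


def ground_distance_m(image_distance_cm: float, denominator: float) -> float:
    """Convert a distance measured on the photo (cm) to ground distance (m)."""
    return image_distance_cm * denominator / 100.0


# Hypothetical camera: 152 mm lens flown 1,824 m above the terrain.
denom = scale_denominator(0.152, 1824.0)

# The example from the text: 1 cm on a 1:12,000 photograph.
ground = ground_distance_m(1.0, 12000)

print(f"scale is about 1:{denom:,.0f}")
print(f"1 cm on the photo represents {ground:.0f} m on the ground")
```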


Remote sensor data resolution is a more complex concept than scale because it 
can be of three basic sorts: spatial, radiometric, and temporal. Spatial resolution 
refers to the minimum size of actual objects that can be discerned in an image. This 
varies with recording medium parameters (photographic emulsions have the highest 
resolution of any remote sensing recording medium) but in all cases is also a 
direct function of image scale. The smaller the scale of an image (that is, the smaller 
the fraction of image size:object size), the lower the resolution. Since larger scale 
images cover a smaller area than smaller scale images, there is always an economic 
trade-off between scale and spatial resolution in remote sensing. 
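
The economic side of this trade-off can be made concrete with a rough count of how many photographs are needed to cover a study area at different scales. The sketch below is only an illustration: it assumes a square 23 cm photographic frame, ignores the overlap between adjacent frames that real flight lines require, and uses a hypothetical 1,000 sq km study area.

```python
import math


def frame_side_km(scale_denominator: float, frame_side_cm: float = 23.0) -> float:
    """Ground length covered by one side of a square frame, in kilometers."""
    return frame_side_cm * scale_denominator / 100.0 / 1000.0


def frames_needed(area_km2: float, scale_denominator: float,
                  frame_side_cm: float = 23.0) -> int:
    """Minimum number of non-overlapping frames needed to cover an area."""
    side = frame_side_km(scale_denominator, frame_side_cm)
    return math.ceil(area_km2 / (side * side))


study_area_km2 = 1000.0  # hypothetical project area

for denominator in (12000, 60000):
    n = frames_needed(study_area_km2, denominator)
    print(f"1:{denominator:,} -> {n} frames")
```

Under these assumptions the larger-scale (1:12,000) coverage requires more than twenty times as many frames as the 1:60,000 coverage, which is the cost that must be weighed against its finer spatial resolution.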


Radiometric resolution refers to the portion or portions of the electromagnetic 
spectrum recorded in remote sensor data. Panchromatic photographs record the 
same portion of the electromagnetic spectrum seen by the human eye; other 
photographic emulsions record ultraviolet or near infrared radiation. Microwave 
(radar) devices record wavelengths much longer than ultraviolet light, while 
scanners can record visual through far-infrared spectra. Film filter combinations 
can restrict the portions of the spectrum that cameras measure, and multiband 
camera clusters have been used to produce multispectral photographic data. 
Multispectral scanners (MSS) record more than one wavelength band; the example of 
multispectral scanner data most familiar to and most frequently used by 
archaeologists is Landsat, which is discussed at greater length below. 


Temporal resolution is a measure of how frequently a scene is imaged through 
repeated aerial photographic overflights or satellite sensor passes. Comparison of 
aerial photographs from the 1930s with those taken more recently provides one 
example of temporal resolution, but the term takes on a clearer meaning in 
reference to regularly repeated satellite data collection. The Landsat satellites, for 
instance, cover the entire surface of the earth (cloud conditions permitting) about 
every 18 days. Temporal resolution is important because the surface and near 
surface of the earth change on both large time scales (e.g., geological and 
geomorphological change) and small time scales (e.g., seasonal variation, vegetational 
change, and modern development), and change at either scale may be important 
archaeologically. 


Remote Sensor Data for Projection and Prediction 


Most archaeological uses of remote sensing that can be characterized as 
projective or predictive make use of two general data sources: aerial photographs 
and airborne or satellite multispectral scanner products. Archaeologists have made 
use of both existing data and data acquired specifically for their projects. 


Existing remote sensor data are of course the least expensive to use, for the 
costs of their acquisition have already been met by others, and copies of the results 
can be obtained cheaply. Aerial photographs at a variety of scales and in several 
emulsions (black and white, color, and black-and-white or color infrared) have 
been taken of almost all the continental United States and much of the world. Many 
aerial photographs are available from government agencies at nominal cost (for a list 
of these see May 1978 or Ebert 1984). The earliest systematic vertical aerial 
photograph coverage of parts of the United States was initiated in the 1930s by the 
Soil Conservation Service, Department of Agriculture, and beginning at about the 
same time the U.S. Geological Survey began blanket coverage of the country for 
topographic mapping purposes. Since that time, other government agencies have 
been taking aerial photographs at an ever-increasing rate. Generally at least one and 
often five or ten different types of aerial photographs will be available for a given 
area of interest. For many purposes and in many project areas, it is likely that 
existing aerial photographs will meet at least some of the archaeological or 
managerial remote sensing needs. It is also likely, however, that no existing aerial 
photographs will satisfy all perceived needs. Most government agency aerial mapping 
photographs are taken at scales smaller than 1:15,000 (1 cm on the photo represents 
150 m on the ground), and many are at very small scales, up to 1:400,000. 


Sometimes it may be necessary to acquire new, project-specific aerial photographs 
to meet certain scale and resolution needs. It may also be the case that the 
time of day or year in which existing photos were taken, or their emulsions, leave 
something to be desired. Flying new photos may at first appear to be an expensive 
solution, and certainly it is more expensive than buying photographic prints from 
the USGS or other agencies. Effectiveness must also be considered, however, and 
often this concern may outweigh high acquisition costs, especially if no other 
suitable photographs are available (Avery and Lyons 1981:18). 


The other remote sensor data source commonly employed in archaeological 
efforts toward projection and prediction consists of images derived from the digital 
multispectral scanners aboard the Landsat satellites. The first Landsat (then called 
ERTS-1) was launched by the USGS in 1972; since that time four Landsat satellites 
have been launched and have provided millions of images of the earth's surface. 
Landsats 1 and 2 orbit the earth 14 times a day in a circular orbit about 900 km above 
the earth; each covers a 185 km swath with little side-to-side overlap at the equator 
and as much as 85 percent at 81° north and south latitude. The satellites’ orbits are 
sun-synchronous, and images are always collected at mid-morning. Landsats 1 and 2 
collect data in four bands designed to provide a contrasting basis for discriminating 
between water and land and among different sorts of vegetation cover and different 
surficial deposits. Landsat data are resampled and corrected after being sent to 
earth, and the resulting resolution of Landsat 1 and 2 data is 80 by 80 m pixels. 
Landsat 3 has the same radiometric resolution in four bands, with the addition of a 
thermal infrared band and a somewhat higher (55 by 55 m) resolution. Landsat 4, 
launched in 1984, has seven spectral bands and even greater resolution. For a 
detailed discussion of the parameters of the Landsat satellite sensor systems and 
their products, see Lillesand and Kiefer (1979:530-583). 


The Landsat satellites are designed to provide versatile data with medium 
spatial resolution, high temporal resolution, and radiometric resolution that would 
make their data ideal for earth resources studies and assessments. Even with a 55 by 
55 m pixel spatial resolution, it is clear that very few archaeological sites or materials 
will be visible on Landsat MSS data. Landsat data are, however, ideal for analyzing, 
measuring, and mapping what archaeologists think of as “independent variables,” 
whether these be assumed landscape preferences of past people, ecosystemic 
variables affecting the placement of human systems and their components, or 
depositional and postdepositional processes that affect the preservation or visibility 
of the archaeological record. It is small wonder that archaeologists interested in 
predicting have made use of Landsat in a variety of ways, and it is likely that Landsat 
data and perhaps data from similar, soon-to-be-launched satellite sensors (the 
French SPOT, for instance) will constitute a major resource for such experiments. 
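
A few quantities implied by the nominal Landsat figures quoted above (14 orbits a day, a 185 km swath, and 80 m or 55 m pixels) can be worked out directly. The sketch below uses only those quoted values; it treats pixels as square and ignores resampling details, so the results are rough approximations rather than sensor specifications. The function names are illustrative.

```python
def orbital_period_minutes(orbits_per_day: int) -> float:
    """Approximate time for one orbit, in minutes."""
    return 24 * 60 / orbits_per_day


def pixels_across_swath(swath_km: int, pixel_m: int) -> int:
    """Whole pixels that fit across the imaged swath."""
    return swath_km * 1000 // pixel_m


def pixel_area_ha(pixel_m: int) -> float:
    """Ground area represented by one square pixel, in hectares."""
    return pixel_m ** 2 / 10000


period = orbital_period_minutes(14)
print(f"orbital period: about {period:.0f} minutes")

for pixel_m in (80, 55):
    across = pixels_across_swath(185, pixel_m)
    area = pixel_area_ha(pixel_m)
    print(f"{pixel_m} m pixels: {across} across the swath, {area:.2f} ha each")
```

A single 80 m pixel therefore averages reflectance over roughly two-thirds of a hectare, which is consistent with the observation that very few individual sites will be visible in such data.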


Remote Sensor Data Analysis Methods and Techniques 


Two basic interpretive or analytical methods have been used by archaeologists 
who have incorporated remote sensing in their projective and predictive experiments. 
The first of these is visual interpretation. As noted above, visual interpretation 
is accomplished by looking at an image in one of several ways. Aerial photographs 
or visual images derived from other analog or digital remote sensing sources, 
such as Landsat, can simply be inspected without optical aids, with images being 
viewed either singly or overlaid in mosaic form. The interpreter, making use of 
internalized knowledge about how certain landforms or other characteristics of the 
environment should appear, makes judgments about areas or zones of differential 
occurrence of these characteristics on the basis of photographic “cues,” including 
tone, color, texture, pattern, shape, and relationship of one photographically 
imaged feature to another (for a more complete discussion of these image properties, 
see Ray 1980:6-13). 


In stereoscopic photointerpretation, an interpreter views two partially overlapping 
photographs, each taken from a different position along a flight line; this is 
usually accomplished with the aid of a stereoscope, which allows the viewer to see 
one photograph with each eye. This results in the perception of a three-dimensional 
image in which the vertical dimension is exaggerated because of the wide spacing of 
the points from which the stereo photos were taken relative to the spacing between 
human eyes. Small topographic differences are thus easily distinguished, giving 
clues to landform and the identity or nature of other characteristics of the scene 
viewed; Ray (1980:14) estimates that topographic differences of as little as 1 ft can be 
discerned by the average interpreter using a stereoscope and 1:20,000 scale aerial 
photos. 


Photointerpretation might be thought of as being subjective, and to a certain 
extent it is. Human interpreters, especially those with experience in 
photointerpretation, possess extensive internalized information about what different landscape 
features and other environmental characteristics “should” look like and may 
stretch these interpretations or generalize boundaries. On the other hand, this 
internal information allows an interpreter to make supported guesses (usually 
correct ones) about phenomena not previously experienced. This is something 
that even the most sophisticated image-processing machines cannot do and is the 
reason that image analysis cannot, at least at present, be totally automatic. 


It should be noted at this point that all map making is a process of interpretation. 
Most topographic maps in use today, including the USGS topographic maps 
used in many experiments in archaeological projection and prediction, are compiled 
using aerial photographs as a primary or exclusive data source. The topographic 
contours are measured and drawn from the three-dimensional data contained in 
vertical-axis, stereo aerial photographs using optical-mechanical or analytical 
photogrammetric plotting devices. While this process is, to a certain extent, subjective, it 
is quite accurate and precisely repeatable. The indicated degree of slope may be less 
so, however, as contour lines too close together to be separated during printing are 
often artificially spread apart. Almost all the rest of the data shown on topographic 
maps are subjectively interpreted and generalized, including the intermittency 
and even the existence of water in streams or springs, and the boundaries of forested 
vs. nonforested lands. Maps are interpretations, and when using them for a specific 
purpose one must ask what the purpose of the interpreter was. For this reason it 
may well be best to rely on one’s own “first generation” interpretation from aerial 
photographs rather than on the standardized subjectivity of USGS maps for 
measurement of landform and environmental variables. 


The second class of methods used by archaeologists in analyzing remote sensor 
data for projective or predictive purposes is encompassed by digital analysis. Digital 
analysis is done by subjecting a matrix of pixel values representing an image to 
numerical analysis, usually using a computer. Computer-assisted image analysis 
procedures include data preprocessing (image sampling and reconstruction, noise 
removal and reduction, and removal of image blur and other distortions; Billingsley 
1983), pattern recognition (Haralick and Fu 1983), the correction of geometric 
distortions in images (Bernstein 1987), digital filtering for edge enhancement, 
histogram manipulations for contrast enhancement, and classification of image 
characteristics through clustering analyses (Estes et al. 1983). Many of these 
operations have already been performed on Landsat MSS digital data when they are 
received from EROS. In addition, digital data can be processed using any other 
numerical or statistical procedure that can be performed on matrix data, and in this 
manner pixel spectral intensity values can be compared with other values (for 
instance, observed densities of archaeological discoveries or materials). The 
archaeological applications of remote sensor data to projection and prediction 
discussed later in this chapter have used either cluster-based classifications of pixels 
or raw pixel data and consist of comparisons of these image data with archaeological 
data distributions. 
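
The kind of comparison described above, between classified pixels and archaeological data distributions, can be sketched as a simple cross-tabulation. This is an illustrative example only and is not the procedure of any study cited in this chapter: the class map, the site counts, and the cell size are all hypothetical, and the two grids are assumed to be registered to the same cells.

```python
def site_density_by_class(class_map, site_counts, cell_area_km2):
    """Return sites per square kilometer for each image class.

    class_map and site_counts are equally sized grids: the first holds
    the cluster class assigned to each cell, the second the number of
    sites recorded in that cell by survey.
    """
    totals = {}  # class -> (number of cells, number of sites)
    for class_row, count_row in zip(class_map, site_counts):
        for cls, n_sites in zip(class_row, count_row):
            cells, sites = totals.get(cls, (0, 0))
            totals[cls] = (cells + 1, sites + n_sites)
    return {
        cls: sites / (cells * cell_area_km2)
        for cls, (cells, sites) in totals.items()
    }


# Hypothetical 3 x 3 grid of cluster classes and surveyed site counts.
class_map = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]
site_counts = [
    [0, 1, 2],
    [0, 3, 1],
    [0, 0, 2],
]

density = site_density_by_class(class_map, site_counts, cell_area_km2=0.25)
for cls in sorted(density):
    print(f"class {cls}: {density[cls]:.2f} sites per sq km")
```

In this made-up example class 2 has by far the highest site density, the sort of pattern that would then need to be tested against independent survey data before being used for projection.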
Some Limitations of Archaeological Remote Sensing 


Several archaeological experiments in which remote sensing data and methods 
have been applied to projection and prediction will be discussed below. The 
potential utility of remote sensing in such research is high, but a short discussion of 
some of the limitations of remote sensing in archaeology is necessary for at least two 
reasons. First, it has been suggested in several places recently that some 
archaeological remote sensing enthusiasts may have oversold the potential of this body of 
techniques and methods (Dowman 1980, 1983; Dunnell 1980; Evans 1983a, 1983b; 
Fuller 1983; Whimster 1983a, 1983b). Second, it is necessary to emphasize that the 
limitations of any measuring technology are dependent on the conditions under 
which it is employed, and that the failure of techniques to reach their full potential 
in one situation does not mean that they will always be less than useful. 


Limitations in archaeological remote sensing can be the result of many factors. 
They may be inherent in the sensing systems themselves; the scale and spatial 
resolution of data provided by a system impose limits on what can be seen or 
analyzed. Lenses, shutter speeds, scanning rates, and the speeds and altitudes of the 
platforms that bear remote sensor devices can impose restrictions on the usefulness 
of data for specific purposes. Spectral resolution is another important system 
limitation, and for any purpose it is important to determine just what portions of the 
electromagnetic spectrum should be measured before remote sensor data are 
collected. Photographic sensors image only a small portion of the electromagnetic 
spectrum, but they possess much higher resolution than most multispectral scanner 
systems. 


Instruments available for laboratory analysis may impose another set of limita- 
tions on the application of remote sensor data to archaeological problems. While 
acceptable pocket stereoscopes can be purchased for about $30, more useful stereo- 
scopes can cost thousands of dollars and may not be available to all archaeologists. 
Digital image-processing systems are even more expensive, although it may be 
possible to rent time on such systems. Several examples of less-than-optimum 
digital image analysis being applied to “predictive modeling” in an attempt to save 
money will be summarized later in this chapter. In many if not most cases, 
archaeologists who wish to incorporate remote sensing methods into their projects 
will do better to contact qualified and well-equipped archaeological and cultural 
resource remote sensing consultants, rather than to entertain notions of doing their 
remote sensing work “in house.” 


Environmental factors impose another sort of limitation on archaeological 
remote sensing. Clouds, mist, and haze can obscure the view of most sensor 
systems; heavy snow or vegetation cover may also defeat some systems (multispec- 
tral scanners and photographic sensors) but have little effect on others. (Radar, for 
instance, penetrates vegetation canopies with relative ease.) The phenomena 
recorded by some scanner systems, in particular thermal scanners, are transient and 
often can be detected only for a few hours or even minutes when conditions are 
optimal; identifying such optimal conditions may take years of experimentation in 
any study area (Perisset and Tabbagh 1981; Tabbagh 1977). 


Individual human limitations, such as ability or inability to perceive stereo 
images, experience in photointerpretation, previous familiarity with a study area, or 
knowledge about what various characteristics of the environment look like, can 
affect the application of remote sensing to any problem area. In general, in order for 
a researcher to apply remote sensing to a problem successfully the problem must be 
stated explicitly; the place of remote sensing in the solution of that problem must be 
defined; and appropriate methods for discovering, collecting, and analyzing the 
archaeological (dependent) data that are to be contrasted with environmental 
(independent) variables must be selected. As we will see when we examine the ways 
in which archaeologists have applied remote sensing to the problem of predicting 
the locations and characteristics of archaeological materials, failure to meet these 
conditions may be one of the most obvious reasons for the lack of satisfying 
conclusions. It may, in fact, explain much of the present lack of success in predictive 
modeling in general. Again it should be emphasized that the specialized technology 
of remote sensing, and the problems it can or cannot help the archaeologist to 
solve, are best assessed and implemented through a team approach incorporating 
not only in-house cultural resource management and archaeological personnel, but 
a specialist in archaeological and cultural resources remote sensing as well. 


CONTEMPORARY APPLICATIONS OF REMOTE SENSING 
TO ARCHAEOLOGICAL PROJECTION AND PREDICTION 


A Taxonomy of Predictive Archaeological Remote Sensing 


In a previous publication (Ebert 1984:341) I proposed a taxonomy that distinguishes 
between archaeological sampling, projection, and prediction. But taxonomies 
are problem-specific, and the problem that I was addressing in this previous 
publication was the application of remote sensing to survey archaeology as a whole. 
The purpose of this chapter is somewhat different: it deals specifically with remote 
sensing applications to projection and/or prediction. As is evident in Chapter 4 of 
this book, I think it is probably most productive to view prediction, here, as an 
integral part of the explanatory framework of archaeology (see Figure 4.1), as 
something that archaeologists must do to draw testable expectations from models 
that describe the ways in which we think the archaeological record is related to the 
organization of past human systems. The term projection has been used in the 
taxonomy in Chapter 4 to designate empirical generalizations about the occurrence 
of archaeological materials in unsurveyed or unsampled areas on the basis of known 
distributions in surveyed areas. Because lax definitions can lead to problems in any 
scientific endeavor (see the Chapter 4 discussion of the site concept, for example), 
the definitions of projection and prediction set forth in Chapter 4 will be used here, 
rather than those I proposed earlier (in Ebert 1984). 


Another theme of Chapter 4 is that we almost certainly do not know how to do 
successful predictive modeling, in the sense of being able to make generally 
applicable statements about the location of archaeological materials in unsurveyed 
areas, at the present time. What is more, we do not know exactly why we cannot 
successfully predict in a general way. For this reason, it is my belief that almost all of 
the archaeological “predictions” that have been attempted, by archaeologists in 
general and by those employing remote sensor-derived data, are probably projections 
in the sense that this term is used in Chapter 4. 


The following discussion of approaches that have used remote sensing data to 
generate projections is arranged according to a taxonomy that emphasizes differences 
in (a) the things that archaeologists want to predict and (b) the remote sensing 
analysis method employed. 


The first taxonomic category comprises approaches that generalize from 
extant archaeological and environmental data about areas in which archaeological 
materials are likely to be found but consider only peripherally where materials will 
not be found. Such approaches could be thought of as prospecting, and their goal is 
to streamline the discovery of archaeological materials in order that those materials 
may be studied or preserved. The two basic analytical methods that have been used 
by archaeologists engaged in this sort of projection are visual analysis and digital 
analysis. 


The second major taxonomic category consists of approaches to archaeological 
projection that use remote sensing to identify areas where archaeological materials 
can be expected and areas where they are not expected to be found. In effect, these 
approaches lead to projections of the differential densities of archaeological materials 
in a study area or, in some cases, densities of specific types of materials. They 
can also be used to design sampling stratifications that are intended to provide this 
type of density information. Again, a distinction will be made between approaches 
that use visual analysis and those that use digital analysis. 


What follows is a review of the literature concerning archaeological projective 
attempts incorporating remote sensing data, organized by these taxonomic categories. 
The successes and failures of these approaches will be discussed once the 
summaries have been presented. 


First, it should be pointed out that the distinction between these two methods 
is really technological: people are involved in making decisions whether the 
processing is done by the human brain or partly by a machine. There are, however, 
some basic quantitative differences between visual interpretation of environmental 
variables and digital analysis. One of the most obvious of these is that people 
generalize when they interpret things from remote sensor data, such as aerial 
photographs or Landsat visual images. A large, relatively homogeneous area of (for 
instance) piñon forest is identified as such by a human interpreter, and tiny inclusions 
of oak are ignored. In the course of a computer digital analysis, on the other hand, 
each pixel is classified, and if an oak pixel falls within a mass of piñon pixels, it is 
classified as oak forest. In many cases, there is nothing wrong with or unworkable 
about the generalizations of human interpreters; the presence of a few oak trees 
within the piñon forest probably constitutes environmental variation at a scale 
incomparable with the scale of human mobility and systems organization. For other 
purposes (in computing an environmental diversity index using a moving filter 
across space, for instance) the ungeneralized, digital classification may be the only 
workable data representation. 
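

As a rough illustration of the moving-filter diversity index mentioned above, the following sketch computes a Shannon diversity value for every cell of a small classified grid. The grid, the class codes, the 3-by-3 window, the choice of the Shannon index, and the function name are all assumptions made for illustration; none of them is drawn from the studies reviewed in this chapter.

```python
import math
from collections import Counter

def shannon_diversity_filter(grid, radius=1):
    """Return a grid of Shannon diversity values (natural log) computed
    over a square moving window of cover-type codes.

    grid   -- list of equal-length rows of hashable class codes
    radius -- window half-width; radius=1 gives a 3-by-3 window
    Windows are clipped at the edges of the grid.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Tally the class codes that fall inside the (clipped) window.
            counts = Counter(
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            )
            total = sum(counts.values())
            out[r][c] = -sum(
                (n / total) * math.log(n / total) for n in counts.values()
            )
    return out

# Hypothetical classified scene: "P" = pinon pixels, "O" = a single oak pixel.
scene = [
    ["P", "P", "P", "P"],
    ["P", "O", "P", "P"],
    ["P", "P", "P", "P"],
    ["P", "P", "P", "P"],
]
diversity = shannon_diversity_filter(scene)
```

In this toy scene the lone oak pixel, which a human interpreter would ignore, raises the diversity value of every window that contains it, while windows made up entirely of piñon pixels score zero. Only the ungeneralized classification preserves variation of this kind.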


It has been asserted (Baker and Sessions 1979) that digital analysis is superior to 
human visual interpretation because human biases are not injected into digital 
products and because digital analyses are replicable. This is a somewhat optimistic 
interpretation of what digital analysis actually entails. In human interpretation, 
subjective decisions are made about where boundaries fall, while in digital analysis 
subjective, human decisions must be made about the limits of cluster boundaries (in 
multidimensional analyses) or about the confidence limits one is willing to accept as 
representing useful correlations between the occurrence of cultural and environmental 
variables. The meaning assigned to interpreted or digitally derived variables 
is subjective in both cases. 


Nonetheless, a distinction will be made below between those “predictive” 
attempts using visual interpretation and those using digital analysis. This is done 
for the most part for historical reasons, as visual interpretations for archaeological 
purposes were attempted earlier than machine-processing-based attempts. Digital 
processing can be cost-saving when large geographic areas are being inspected, and 
digital-format predictive products are also easier to incorporate into geographic 
information systems. For these reasons, digital-format products are likely to be the 
major thrust of remote-sensing-aided archaeology in the future. 


Archaeological Projection Through Visual Analysis of 
Remote Sensor Data 


Archaeologists have been using remote sensing, particularly aerial photointerpretation, 
for the discovery and inspection of sites since the early 1900s (Beazeley 
1919; Capper 1907; Lindbergh 1929). Especially in Great Britain and Europe, most 
archaeological uses of aerial photographs are still directed toward actually seeing the 
manifestations of sites and structures through shadow or crop marks (Riley 1980, 
1982; Wilson 1982). The examples of “prediction” of areas likely to contain positive 
archaeological evidence discussed here, however, are somewhat more indirect. In 
these examples the experimenters seek not to see actual sites but rather to correlate 
the distribution or occurrence of known archaeological materials with certain 
landform and environmental characteristics. These independent variables are then 
sought in areas that have not been archaeologically investigated, and uninvestigated 
areas exhibiting such properties are postulated to have a high likelihood of 
containing archaeological materials. In these studies, remote sensing typically 
provides the basis for characterizing the environment in areas known to contain 
sites as well as for finding unsurveyed areas with the same characteristics. 


One of the earliest predictive studies of this type was carried out in Iowa and 
used aerial photographs to define and map soil types that were thought to be ideal 
for the agricultural subsistence practices of the mound-building people who occupied 
the area shortly prior to European contact (Tandarich 1975). Soil types were 
photointerpreted and classified according to Department of Agriculture criteria, 
and those soil types that had been found to be associated with mound sites in the 
past were further interpreted in stereo to find mounds. 


Another early predictive prospecting study made use of Landsat (ERTS-1) 
visual multispectral scanner (MSS) data. In this study, Cook and Stringer (1975) 
attempted first to see the actual manifestations of large, known, historically aban- 
doned village sites in a boreal forest around Kaltag, Alaska, but they were unable to 
determine whether the spectral signatures they saw indicated the villages them- 
selves. Then, by characterizing the landscape and vegetation in the vicinity of the 
known village sites, they attempted to predict the potential presence of additional 
villages in other parts of their 450 mi² study area. They felt that they were able to 
relocate 5 of the 12 known village sites, and they also predicted a number of other 
potential sites, although these were not field checked. 


A similar though more rigorous method was adopted by archaeologists at the 
National Park Service’s Remote Sensing Division in a study directed toward 
locating areas within Shenandoah National Park in Virginia that had a high 
potential for prehistoric and historical archaeological site occurrence (Ebert and 
Gutierrez 1979a, 1979b, 1979c). One impetus behind this study was the desire of park 
personnel to find exemplary archaeological sites that could be used in interpretive 
programs. Additionally, this experiment was undertaken in an attempt to show that 
aerial remote sensing could be of value in the eastern deciduous woodland; a 
persistent theme in critiques of archaeological remote sensing is that it is only useful 
in the arid Southwest, where sites can be seen because of sparse vegetation cover. In 
the Shenandoah project it was not sites themselves that were seen, but rather their 
settings. 


The first step in this project was the selection of environmental indicators 
(Ebert and Gutierrez 1979b:7), which were chosen not because of any assumed 
preferences on the part of prehistoric and historical occupants of the area but rather 
because these environmental characteristics could be photointerpreted from 
1:12,000 scale color transparency aerial photographs of two areas of the park. Values 
for the variables of slope type, slope angle, slope aspect, vegetation type, vegetation 
diversity, soil thickness, type of surface deposit, bedrock type, and proximity to 
contacts, faults, and shear zones were formulated, and recognition criteria for each 
value were explicitly identified. 


The next step in the project was to mark the exact locations of previously 
located historical and prehistoric sites on the aerial photographs. In no case could 
the site itself be seen, but topographic factors allowed map locations to be trans- 
ferred to the photos accurately. Within an arbitrary radius of 250 ft around each 
known location, the environmental indicators were interpreted using a Bausch and 
Lomb roll-film stereoscope with magnification up to 20×. The results of this 
interpretation were tabulated and coded, and the characteristics of places where 
sites were likely to be found were summarized. Some successful “indicators” of sites 
were slope angles from 0 to 5°, exposures of southeast to southwest, and proximity 
to fault or shear zones; historical sites differed from prehistoric ones in that most 
known historical sites occurred on locally flat areas on sideslopes and in colluvial or 
alluvial deposits. 

Finally, the aerial photographs were reinterpreted to locate areas that exhibited 
these site-likely indicators but had not been surveyed for archaeological 
materials in the past. A field check at Shenandoah revealed the presence of 
previously unrecorded archaeological materials, some of a spectacular nature 
(including a large nineteenth-century mill site; Figure 9.1), in 45 percent of the 
projected “likely” areas. One obvious weakness of this study was that no unlikely 
areas were field checked to test the rejection potential of this projection. 
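

The general sequence just described (tabulate indicator values at known sites, summarize the common ones, then look for unsurveyed areas showing the same values) can be sketched as follows. This is only a schematic illustration: the variable codings, the 50 percent threshold, the all-variables-must-match rule, and the function names are assumptions, and the sketch is not a reconstruction of the procedure Ebert and Gutierrez actually followed.

```python
from collections import Counter

def frequent_indicator_values(site_records, min_share=0.5):
    """For each variable, keep the values observed at no less than
    min_share of the known sites."""
    n = len(site_records)
    keep = {}
    for var in site_records[0]:
        counts = Counter(rec[var] for rec in site_records)
        keep[var] = {v for v, k in counts.items() if k / n >= min_share}
    return keep

def is_site_likely(area, indicators):
    """An area is flagged when every variable that has at least one
    frequent value takes one of those values."""
    return all(area[var] in vals for var, vals in indicators.items() if vals)

# Hypothetical codings of the photointerpreted settings of four known sites.
known_sites = [
    {"slope": "0-5", "aspect": "S"},
    {"slope": "0-5", "aspect": "SE"},
    {"slope": "0-5", "aspect": "S"},
    {"slope": "5-10", "aspect": "S"},
]
indicators = frequent_indicator_values(known_sites)

# Hypothetical unsurveyed areas interpreted from the same photographs.
unsurveyed = {
    "A": {"slope": "0-5", "aspect": "S"},
    "B": {"slope": "10-15", "aspect": "N"},
}
likely = [name for name, area in unsurveyed.items()
          if is_site_likely(area, indicators)]
```

A rule of this kind says nothing about the areas it does not flag. Judging its rejection potential would require field checks in unflagged areas such as the hypothetical area "B" as well.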


Figure 9.1. An eighteenth-century mill or industrial site discovered in Shenandoah National Park, 
Virginia, through remote-sensing-aided archaeological projection. The existence of this complex was known 
from tax records, but its location was not pinpointed until field checking of “probable site areas” derived 
through the analyses of 1:12,000 scale color infrared aerial photographs of portions of the park was initiated (after 
Ebert and Gutierrez 1981). 


In another projective study carried out during the same year, 1:60,000 and 
1:120,000 scale color infrared transparency photographs were inspected for indicators 
of site occurrence in the National Petroleum Reserve-Alaska on Alaska’s North 
Slope (Gal 1979). Gal felt that, while there was little hope of actually seeing village 
sites themselves, there was more potential for “identifying areas where not to look 
for archaeological sites and areas of high archaeological potential” (1979:1). He 
sought such indicators as the consistent appearance of whaling lances and areas 
where early melting of snow and ice provided locations desirable for springtime 
camping grounds. Areas with known archaeological sites appeared to be lighter in 
color than surrounding areas in the color infrared aerial photographs, which were 
taken in July; Gal believed that this indicated better-drained places where vegetation 
flourished but died off first. Gal concluded that such studies held great 
potential, especially in the Arctic where ground reconnaissance is expensive and 
difficult and where “narrowing down” of survey areas is virtually necessary. 


Two studies that followed the lead of the Shenandoah projection experiment 
were also undertaken in the Eastern forests by archaeologists from the National 
Park Service's Southeastern Archaeological Center. Inspection of color infrared 
aerial photographs, which make it relatively easy to recognize the distinction 
between water and land, provided a preliminary indication of where to conduct 
archaeological surveys in the Big Cypress Swamp in Florida (J. Ehrenhard 1980). In 
such areas, of course, human occupation takes place only where there is no standing 
water, a criterion that restricts “site likely” areas severely. By noting the locations 
of small mounded areas surrounded by sawgrass and water, archaeologists were able 
to narrow down their survey efforts to a very small percentage of the total area 
encompassed by Big Cypress Swamp. A more complex series of indicators interpreted 
from aerial photographs, including topographic, hydrologic, and soil productivity 
variables, was correlated with different temporal and functional characteristics 
of a sample of previously known archaeological materials in the 
Chattahoochee River Recreation Area; the resultant model proved to be successful 
in locating sites from different time periods (E. Ehrenhard 1980). 


Digital Approaches to Archaeological Projection 


A digital approach to detecting and analyzing the “residual effects of prehistoric 
human settlement upon landscapes” was undertaken in the late 1970s in 
southwestern New Mexico in an attempt to locate Animas phase pueblos for further 
study (Findlow and Confeld 1980:31). Landsat MSS computer compatible tapes 
(CCTs) were analyzed at Columbia University using Map I software. The spectral 
characteristics of “catchments” of 32 by 40 pixels (about 1200 acres), 16 by 20 pixels 
(about 300 acres), and 8 by 10 pixels (80 acres) centered on 8 large (100-500 room) 
Animas phase sites and 33 randomly selected plots that had not been previously 
surveyed were compared using analysis of variance statistics. Findlow and Confeld 
concluded that soil and vegetation were darker around site areas than in nonsite 
catchments (Figure 9.2) and that these differences were particularly obvious in the 8 
by 10 pixel examples. The lower reflectance was attributed to greater moisture 
retention and to the existence of cultural debris in middens surrounding the large 


It is difficult to determine whether the above example actually constitutes a 
case of “seeing sites,” or whether it was the contexts of the sites that were being 
detected. Another projective study utilizing remote sensing that evokes the same 
question has been pursued since the early 1980s by K.-Peter Lade at Salisbury State 
University in Maryland (Lade 1981a, 1981b, 1982). Using Salisbury State's ASTEP I 
software for the analysis of Landsat MSS data, Lade examined the land cover and 
geology of a 34,000 km² study area (an entire Landsat scene), classifying pixels on the 
basis of “angular distance relationships observed in vector space of normalized 
data” (principal components analysis; Lade 1981a:13). He found that dry, sandy 
ridges were easily discerned through density slicing of Landsat’s band 7 (Figure 
9.3), and that such ridges were usually entirely covered with cultural materials 
representing multiple occupations through long time periods. It is not clear 
whether Lade is identifying the effects of such occupation or a particular landform 
type conducive to occupation (or to finding materials, since sand ridges typically have 
discontinuous vegetation cover), but his projections are undeniably successful at 
finding site-likely areas. 
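

Density slicing, as the term is generally used, divides the range of brightness values in a single band into a few intervals and assigns every pixel to the interval in which its value falls. The sketch below shows the idea in its simplest form. The brightness values, the two breakpoints, and the function name are hypothetical; they are not Lade's data or settings.

```python
from bisect import bisect_right

def density_slice(band, breakpoints):
    """Assign each brightness value to a slice number: 0 for values below
    the first breakpoint, 1 for values from the first breakpoint up to
    (but not including) the second, and so on. breakpoints must be sorted."""
    return [[bisect_right(breakpoints, value) for value in row] for row in band]

# Hypothetical single-band brightness values for a 3-by-4 pixel window.
band7 = [
    [12, 14, 40, 42],
    [13, 38, 41, 15],
    [11, 12, 39, 14],
]

# Hypothetical breakpoints separating dark, intermediate, and bright pixels.
slices = density_slice(band7, breakpoints=[20, 35])

# Row and column positions of the pixels that fall in the brightest slice.
bright = [(r, c) for r, row in enumerate(slices)
          for c, s in enumerate(row) if s == 2]
```

Where the breakpoints are placed is an analyst's decision, which bears on the earlier point that digital products are not free of subjective judgment.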


A more rigorous prospecting approach, which was also carried out in an eastern 
coastal plain and piedmont setting, is Wells's (1981) study, which is explicitly based 
on discrimination of landscape features. Wells selected “predictive environmental 
variables” (1981:22), including distance to water, specific soil types, and specific 
geomorphological and topographic settings, as well as known archaeological site 
locations, and subjected these variables to a logistic regression. His results were 
tested by field-inspecting both site-likely and site-unlikely areas. Although Wells 
primarily used information derived from map-based geographic information systems, 
based on photointerpretation by others, he discusses at length the potential 
for automatic projections of this type using Landsat MSS data. 
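

For readers unfamiliar with logistic regression, the following minimal sketch fits a one-variable model relating site presence to distance from water. Everything in it is assumed for illustration: the data are invented, a single predictor stands in for the several variables Wells used, and the simple gradient-based fitting routine is a teaching device, not the procedure reported in his study.

```python
import math

def predict(w, row):
    """Estimated probability that a location with these values holds a site."""
    z = w[0] + sum(wj * x for wj, x in zip(w[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, rate=0.1, epochs=5000):
    """Fit weights (intercept first) by plain gradient ascent on the
    log-likelihood. X is a list of feature rows, y a list of 0/1 outcomes."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, target in zip(X, y):
            err = target - predict(w, row)
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        # Step along the average gradient of the log-likelihood.
        w = [wj + rate * g / len(X) for wj, g in zip(w, grad)]
    return w

# Hypothetical training data: distance to water (km) at surveyed locations,
# and whether a site was present (1) or absent (0) there.
distance_km = [[0.1], [0.2], [0.3], [0.5], [1.0], [1.5], [2.0], [2.5]]
site_present = [1, 1, 1, 0, 1, 0, 0, 0]

weights = fit_logistic(distance_km, site_present)
near = predict(weights, [0.2])   # estimated probability close to water
far = predict(weights, [2.2])    # estimated probability far from water
```

Because the fitted model returns a probability for any location, it yields statements about site-unlikely as well as site-likely areas, which is what allows both kinds of area to be field checked.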


An approach similar to the earlier Shenandoah experiments was used in 
Kentucky by Carstens et al. (1981). Stereo photointerpretation of 1:20,000 aerial 
photographs was performed by a number of independent interpreters, and the 
results consisted of codings of the characteristics of landforms and vegetation cover 
in a 400 by 400 m grid overlain on the photos. The same exercise was then 
undertaken using 1:7920 scale photographs and a 100 by 100 m grid overlay. The 
smaller grid overlay interpretations proved to be more useful for identifying areas in 
which archaeological sites were found (using a previously known sample that 
presumably had not been inspected by the interpreters prior to their inspection of 
the photographs), resulting in the recognition of 13 of 19 known sites (68 percent 
accuracy). A field check revealed that additional, previously unknown sites could 
actually be found in 78 percent of the projected likely grid cells. Another study 
following almost the same methodology but using photomosaic (monoscopic) 
interpretation rather than stereo photointerpretation (Haase 1981) predicted site 
densities on Cedar Mesa, Utah, with more variable results. 


An ongoing projective experiment using known locations of Gallo-Roman 
villas in the Burgundy region of France (Madry 1983, 1984) is especially interesting 
in that it incorporates modern digital analysis in an area that had until his study 
been a stronghold of the “seeing sites” approach to remote sensing; usually aerial 
photography from light aircraft served only as a peripheral means of documentation. 
Madry analyzed two Landsat 2 MSS scenes using the General Electric “Image 
100” system, with known villa locations serving as a training set (that is, the spectral 
characteristics of known villa locations served as “instructions” to the computer, 
which then selected other areas with similar spectral signatures). He has concluded 
thus far that the resolution of the Landsat data (80 by 80 m pixels) is too gross to 
allow new villa sites to be found, although relatively intense and continuous land 
use since the Gallo-Roman occupations may also be a factor in his lack of success. 
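

The notion of a training set can be made concrete with a very simple classification rule. In the sketch below, the range of values in each band at known locations defines a "signature," and any other pixel whose values fall inside all of those ranges is selected. This box (parallelepiped) rule is chosen here only because it is easy to follow; the text does not say which decision rule Madry used, and the band values, pixel locations, and function names are hypothetical.

```python
def training_signature(training_pixels):
    """Return the (min, max) range in each band across the training pixels."""
    bands = list(zip(*training_pixels))
    return [(min(b), max(b)) for b in bands]

def matches(pixel, signature):
    """True when every band value falls inside the training range."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(pixel, signature))

# Hypothetical four-band values at three known site locations.
training = [(30, 42, 55, 60), (32, 40, 58, 63), (31, 44, 54, 61)]
signature = training_signature(training)

# Hypothetical unclassified pixels keyed by (row, column).
scene = {
    (0, 0): (31, 41, 56, 62),
    (0, 1): (20, 25, 70, 80),
    (1, 0): (32, 44, 58, 63),
}
selected = sorted(loc for loc, px in scene.items() if matches(px, signature))
```

A rule of this kind can succeed only if the training locations are spectrally distinct from their surroundings at the pixel size of the data, which is the difficulty Madry reports with 80 m pixels.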


All of the approaches summarized above share a number of common characteristics. 
Their primary goal is the identification of areas that are likely to contain 
archaeological materials, based on characteristics of the locations of previously 
discovered sites. While they are strictly empirical, these studies are also “positive” 
in that their goal is to find sites or materials for archaeological study. Their results 
cannot, therefore, be easily converted into statements about areas where sites and 
materials will not be found and thus areas that can be ignored for clearance or study 
purposes. 


The more explicitly “predictive” studies that will be discussed below are, for 
the most part, extensions of these projections. Although such extensions are useful, 
the resultant models are no more explanatory than the correlations on which 
prospecting for site-likely areas is based. 


“Predictions” of Site Occurrence/Nonoccurrence or 
Site Densities Based on Remote Sensor Data 


Unlike the prospecting approaches to predicting likely areas in which to find 
undiscovered archaeological materials discussed above, the avowedly “predictive” 
remote sensing approaches to the archaeological record that are summarized below 
are directed toward identifying areas of differential occurrence of sites within 
regions. Such differences may be expressed in terms of sites vs. no sites (or nonsites), 
differential densities of sites, or variation in densities of more than one temporal, 
cultural, or functional site type between zones within a study or management area. 
Nonetheless, the discovery of these differences is approached in essentially the 
same way as was site “prospecting” in the section above. The locations of known 
sites, or of different types of sites, are tabulated from previous survey data; the 
study area is then divided through either an arbitrary stratification (e.g., grid cells) 
or an informed stratification (environmental zones of one sort or another). Through 
one of a number of statistical techniques, the physical locations where sites previously 
have been found are correlated with the physical locations of independent 
environmental variables (see Chapters 5-8 of this volume). On this empirical basis, 
projections are made about where sites will or will not be found and about the 
densities of sites in general or of different types of sites in areas where the 
archaeological record is not known. 


Clearly, these are not predictions in the explanatory sense set forth in Chapter 
4; they are inductive, empirical generalizations or projections. Some of the implications 
of this fact for the utility of such predictions are discussed in a subsequent 
section of this chapter, along with ways of going beyond correlations of archaeological 
manifestations and environmental variables. 


Archaeological Prediction and Visual Interpretation 


To my knowledge, the first “predictive” archaeological attempt utilizing 
remote sensing as a major environmental data source was initiated in early 1977 as 
part of the National Park Service Alaska Area Office’s cultural resources assessment 
of the National Petroleum Reserve in Alaska (NPRA). This experiment was originally 
envisioned simply as an exercise in sample design; it has obvious implications 
for remote-sensing-based archaeological prediction, however, and in fact the 
methods used were incorporated in almost identical form into an avowedly “predictive 
model” of site densities in the San Juan Basin of New Mexico that was carried 
out by the National Park Service shortly after the Alaskan project was completed. 


The National Park Service was asked to conduct a reconnaissance of the 
NPRA, which covers about 23 million acres (92,000 km²) of Alaska’s North Slope, by 
the Bureau of Land Management prior to the opening of the area to virtually 
unrestricted petroleum exploration. The ideal would have been to survey a representative 
sample of the whole project area, but this was nearly impossible given the 
short, 8-12 week summer field survey season; the general inaccessibility of the 
study area; the impossibility of land transportation during times when the ground 
was not snow-covered; and the tremendous area to be covered. Although the North 
Slope appears almost featureless on atlas maps, it extends from the peaks of the 
Brooks Range across foothills and a sand-mantled upland region to the poorly 
drained coastal plains. The great environmental variability and logistical problems 
of surveys in the Arctic meant that any sort of successful appraisal of the nature and 
distribution of archaeological materials in the NPRA would require careful sampling 
and planning, and the National Park Service’s Remote Sensing Division in Albu- 
querque was asked to provide assistance. 


The most interesting potential use of remote sensor data for this project was as 
a basis for sample stratification and design. A sample design was created, based on 
an informed stratification of the NPRA into a relatively small number of “ecologic 
cover-type zones” (Brown and Ebert 1978; Ebert 1978; Lyons and Ebert 1978) 
determined on the basis of visual interpretation of Landsat MSS color composite 
images. 


An initial ecologic cover-type stratification was compiled through visual 
interpretation of 10 Landsat scenes and mapped on a 1:500,000 scale base map of the 
NPRA (Figure 9.4). Subsequently, 1:60,000 and 1:120,000 scale color infrared aerial 
transparencies, which present a spectral picture nearly identical to that of Landsat 
color composite MSS visual images, were used as a preliminary check on the 
Landsat-derived classification, which defined seven zones based primarily on interpreted 
drainage and vegetation-cover differences. Although the small-scale aerial 
photographs could also have been used to stratify the NPRA for sampling, this 
would have required the interpretation of literally tens of thousands of frames, a 
practical impossibility. The second level of ground truthing consisted of an examination 
of zone boundaries across the NPRA from a helicopter platform, which 
illustrates the point that ground truthing does not always have to be done on the 
ground. Additional oblique aerial photographs were taken during this helicopter 
examination, using 35 mm cameras and color infrared film for documentation 
purposes (Ebert 1980). Finally, observations of boundaries and the vegetative 
composition of each ecologic cover-type zone were made on the ground. 


Figure 9.4. An ecologic cover-type map of the National Petroleum Reserve in Alaska (NPR-A) compiled with the aid of interpretation of Landsat color 
composite visual data. The seven cover types, which are composites of hydrologic, vegetative, and topographic indicators, proved to be successful in the 


Although it had been assumed that this stratification would be used to select 
those areas in which survey would be carried out by NPS field crews, conflicting 
ideas about the goals of the survey prevented this from being accomplished. By the 
time the preliminary stratification had been completed, the NPRA survey crews 
were already in the field and had already selected areas to be surveyed based on 
potential site densities; that is, areas that were believed, on the basis of past 
experience in the Arctic, to be likely to contain concentrations of archaeological 
sites were chosen for reconnaissance. As pointed out previously, this is a valid 
approach if it is the highest concentrations of spectacular sites that one is seeking, 
and in fact the major outcome of the NPRA cultural resources assessment was the 
setting aside of a number of National Register districts with high concentrations of 
archaeological materials. 


Even though the remote sensing sample stratification was not used to select 
survey areas, the discriminatory power of the sample stratification was tested using 
the data that were collected. The approximate boundaries of the surveyed areas 
were marked by the survey crew leaders on 1:250,000 topographic sheets, and the 
survey areas were then carefully stratified, using detailed versions of the ecologic 
cover-type zones discussed above. The area of each stratum actually surveyed was 
measured with a digital planimeter and compared with the numbers of the types of 
sites discovered during that survey. On the basis of this information, purely 
empirical “predictions” of site density within particular strata were made. 


The second season of survey, carried out in the summer of 1978, was also
conducted without reference to the ecologic cover-type sample stratification.
Portions of four strata that were partially covered during the first season (summer of
1977) were also surveyed during this second reconnaissance. A comparison of site
densities in these strata between the two field seasons is interesting (Table 9.1).
The striking differences may be a result of variations from place to place within the
NPRA in the effectiveness of the ecologic cover-type stratification. Alternatively,
these differences may reflect changes in the ways things were sought in the field, in
the experience and expectations of the crew in successive summers, and in the ways
that sites were recorded. Moist tundra, where the lowest densities were found in
both seasons, is typically covered by dense grass tussocks, and none but the most
obtrusive archaeological materials can be found there. The “brush” stratum occurs
along rivers and lakeshores, and it is in this zone that large village sites are usually
found, probably because of the availability of firewood. Sites do not occur
everywhere within brush areas, however. The brush cover must occur in conjunction
with one or more of a number of geographic situations (caribou crossings, river
confluences, the windward side of lakes, etc.) if the likelihood of finding a site is to
be increased. The survey crew may well have learned to identify these combinations
of factors, which would account for the dramatic increase in identified “brush”
sites during the second season. Another possibility is that, while the “brush” strata
in which survey was carried out in the first and second survey seasons were of the
same composition, other properties of the strata, such as distances to boundaries or
sizes of portions of this stratum, may have been different (Michael Garratt, personal
communication 1985). This might underline the appropriateness of attempting to
derive diversity or heterogeneity measures from remote sensor data for predictions,
a topic that will be discussed at length later in this chapter.


TABLE 9.1.
Sites located by the NPR-A cultural resource surveys, 1977 and 1978

Stratum                      Area Surveyed (sq km)   Number of Sites   Sites/sq km

SUMMER 1977
Moist tundra                 [illegible]             [illegible]       0.079
Alpine tundra/moist tundra   536.83                  [illegible]       0.253
Brush/moist tundra           117.73                  [illegible]       0.24
Bare rock/gravel             [illegible]             [illegible]       0.803

SUMMER 1978
Moist tundra                 [illegible]             12                0.212
Alpine tundra/moist tundra   219.42                  [illegible]       0.500
Brush/moist tundra           [illegible]             73                0.94
Bare rock/gravel             4.42                    [illegible]       0.389


At about the same time that the National Petroleum Reserve in Alaska cultural
resource project was winding down, the National Park Service’s Southwest
Regional Office became involved in studying cultural resources as part of another
multi-agency impact assessment, the San Juan Basin Regional Uranium Study in
northwestern New Mexico. The Bureau of Indian Affairs, which administered the 
study, requested that the NPS Southwest Regional Office study the potential 
impacts of uranium mining and associated development on the cultural resources of 
this 100 by 100 mi area. 


The primary task undertaken by the National Park Service for this purpose
was the consolidation, in consistent format, of all available archaeological survey
data from some 4000 known surveys that had taken place in the San Juan Basin, a
herculean task in itself. Extensive data on more than 16,000 sites were compiled and
recorded on computer media, and software was devised to make access to any aspect
of these data simple and economical. These data have formed the basis of a wide
range of assessments and discussions of the archaeology of northwestern New
Mexico and the dangers threatening these resources today (see especially Plog and
Wait 1982).


As part of this impact assessment, the National Park Service’s Remote Sensing 
Division was asked to attempt to predict distributions of sites—to “make some 
statement about the distribution of archaeological sites throughout the Basin” 
(Drager and Lyons 1983:2) using remote sensing data and methods. An approach 
virtually identical to that of the NPRA remote sensing sampling design was adopted 
for the San Juan Basin project. An ecologic cover-type stratification was prepared 
through the visual interpretation of Landsat MSS color composite visual images, 
based on the methods used in the NPRA (Camilli 1984; Drager 1980a, 1980b; Drager 
et al. 1982). Eighteen different cover types with 22 additional subtypes were defined
for this area, which is environmentally far more complex than Alaska’s North Slope
(Figure 9.5). In addition, these cover types were cross-correlated with eight
landform types (see Drager and Lyons 1983 for details). The resultant zones were
mapped on a 1:250,000 scale base map. Other information also examined for the San
Juan Basin included surface geology and average annual precipitation. 


The first step in making projections about site densities was to overlay 2 by 2 
km grid squares to code the previously surveyed areas onto an ecologic cover-type 
map of the basin. Surveyed squares that comprised more than one ecologic cover- 
type zone were eliminated. Numbers of archaeological sites found within each zone 
in the course of previous surveys were then determined by searching the computer 
data base. For each zone, the total number of sites found was divided by the area 
surveyed to calculate a density figure. The number of sites in each zone was then 
predicted. Previous archaeological surveys had only been conducted in 21 of the 40
zones/subzones defined during ecologic cover-type mapping, and predictions were
made only for these zones. Still, some 51,700 sites were predicted to be present in
these zones, a sizable (and perhaps unmanageable?) number. 
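
The projection arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the San Juan Basin study: the zone names, site counts, and areas are invented, and the function name `project_sites` is hypothetical. Only the procedure (sites found divided by area surveyed, then multiplied by the total area of the zone) follows the text.

```python
# Hypothetical survey summaries per ecologic cover-type zone (illustrative values only).
# Each entry: (sites found, area surveyed in sq km, total zone area in sq km)
zones = {
    "zone_a": (120, 80.0, 1600.0),
    "zone_b": (15, 60.0, 900.0),
    "zone_c": (0, 0.0, 500.0),   # never surveyed, so no projection is possible
}

def project_sites(sites_found, area_surveyed, zone_area):
    """Return (density, projected site count), or None if the zone was never surveyed."""
    if area_surveyed <= 0:
        return None
    density = sites_found / area_surveyed      # sites per sq km in the surveyed portion
    return density, density * zone_area        # empirical projection to the whole zone

projections = {name: project_sites(*vals) for name, vals in zones.items()}
for name, result in projections.items():
    if result is None:
        print(f"{name}: no surveyed area, no projection")
    else:
        density, expected = result
        print(f"{name}: {density:.2f} sites/sq km, about {expected:.0f} sites projected")
```

The sketch also makes plain why projections could be offered only for zones with some prior survey coverage: a zone with no surveyed area has no empirical density to extend.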


Several other projective experiments in New Mexico, all based on the metho- 
dology used in the NPRA and San Juan Basin projects, have been reported in the 
literature (Camilli 1979a, 1979b; Camilli and Seaman 1979; McAnany and Nelson
1982), and an additional experiment in predicting site densities across ecologic
cover-type, surface geological, and soils zones (all based on remote sensing) has
since been carried out by the Remote Sensing Division (Drager and Ireland 1986) as
well. All of these approaches exemplify the ways in which an area can be stratified
into different and often empirically significant areas or strata for sampling or for
empirical projection from known site distributions to the distributions of sites in
areas not yet surveyed. All suffer the same deficiencies exhibited by other empirical
correlative “predictive” schemes: they are not explanatory, and their success or
failure at prediction, even if “tested,” cannot be accounted for.


An additional problem of projective or predictive experiments based on the 
use of archaeological data from many surveys should be mentioned briefly here. 
Although many states or regions of this country have well-developed data management
or geographic information systems from which great volumes of survey data
can be recovered, all archaeologists are aware that survey data are often inconsistent.
The goals of archaeology have changed in a broad sense through time, and
many sorts of information recorded today were ignored in the past. Survey data
seldom contain any useful indication of survey intensity, a factor that is
all-important for judging the completeness of recovery and representativeness of data
collection (see discussion in Chapter 4). As discussed earlier, geomorphological and
climatic conditions may be as important in determining what is found during
surface surveys as what is actually there; most survey data do not provide
information on these factors, either. In predictive experiments that utilize data from many
different surveys, it may primarily be variations in survey quality, rather than the
characteristics of the actual archaeological record, that are being measured. Some
ideas about how archaeologists might deal with this problem are presented in
Chapter 7.


Figure 9.5. San Juan Basin ecologic cover-type zones delineated through the interpretation of
Landsat MSS visual data, produced in an attempt to project archaeological site densities and the
differential locations of archaeological site types in northwestern New Mexico (Camilli 1984:Fig. 4).
The interpretation methods followed in this effort were essentially the same as those used in the
NPR-A interpretation shown in Figure 9.4.


Archaeological Prediction Through Digital Analysis 


A final class of “predictive” experiments utilizing remote sensor data makes
use of the computer analysis of digital remote sensor data: either digitally recorded
Landsat or other satellite data or, in a few cases, analog (photographic) images
converted to digital form. Digital analyses can take a number of forms, distinguished
on the basis of the degree of explanatory meaning imparted to the results of
the analysis. Some of the digital analyses presented here purposely avoid making
statements about why archaeological sites or materials are found in specific areas or
in conjunction with certain spectral value ranges, citing instead the “objectivity” of
automatic digital processing. As discussed above, this assessment is not completely
realistic, for in any kind of computer processing of image data, decisions about
cutoff points for clustering or correlative analyses must be made subjectively.


In order to make this point clear, a typical digital image analysis procedure,
clustering analysis, will be summarized briefly here. The goal of digital clustering
analysis is recognition of recurrent spectral patterns across multiple bands of a
multispectral image (such as a Landsat MSS scene or subscene). Different sorts of
phenomena on the earth's surface (plants, water, bare soil, or rocks, for instance)
reflect electromagnetic radiation differentially across multiple spectral bands. For
example, water absorbs almost all infrared radiation and appears black in Landsat
infrared bands and lighter in the red and green bands; bare soil reflects highly in all
four bands; and growing vegetation reflects infrared radiation but absorbs light in
the red band. Digital clustering analyses examine the differential values of each
pixel in more than one band and group them into clusters on the basis of
subjectively determined cutoff values.


A digital analysis can be either supervised or unsupervised. In a supervised
classification, a human operator directs the computer analysis by specifying a
“training set” of areas that represent each desired cover type class to be
discriminated. The computer then attempts to fit the spectral variability within the data
into these clusters (not always successfully). In an unsupervised classification, the
computer discriminates clusters only on the basis of arbitrary cutoff values that
draw boundaries between clusters of values in n-dimensional space (where n is the
number of spectral bands used in the analysis). There are several kinds of cluster
cutoff boundaries that can be used, including minimum distance to means classifiers
(which measure the distance between each pixel and the cluster centroids),
parallelepiped classifiers (which consider the range of variance in a training set),
and maximum likelihood classifiers (which evaluate both the variance within classes
and the correlation between them; Lillesand and Kiefer 1979:457-487). The machine
then tells the operator how many classes it has bounded, and the operator must
decide what is actually being represented by each class. Following unsupervised
classification, classes are usually collapsed into fewer classes by the operator, and
these aggregate classes are named according to what the operator thinks they
represent. Subjective decisions obviously enter into each type of cluster analysis,
and the interpretation of what the results of such an analysis mean is always
subjective as well. The actual composition of each area can be ground-checked and
can also be compared with values of dependent variables (archaeological site
densities, in most of the examples summarized in this chapter), but the reasons for
the correlation between environmental and cultural variables are not obvious.
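
To make the role of the subjective cutoff concrete, the following is a minimal sketch of a minimum distance to means classifier in Python with NumPy. It assumes a small training set is available; the band values, class names, the function name `minimum_distance_classify`, and the cutoff of 20.0 are all invented for illustration and are not drawn from any of the studies discussed here.

```python
import numpy as np

# Hypothetical training pixels: rows are pixels, columns are four spectral bands
# (green, red, infrared 1, infrared 2). All values are invented for illustration.
training = {
    "water":      np.array([[20, 15, 5, 3], [22, 17, 6, 4], [18, 14, 4, 2]], dtype=float),
    "bare_soil":  np.array([[60, 65, 70, 72], [62, 66, 71, 75], [58, 63, 68, 70]], dtype=float),
    "vegetation": np.array([[30, 20, 80, 85], [32, 22, 82, 88], [28, 18, 78, 83]], dtype=float),
}

def minimum_distance_classify(pixels, training, max_distance=None):
    """Assign each pixel to the class whose training mean is nearest (Euclidean).

    max_distance is the subjective cutoff: a pixel farther than this from every
    class mean is labeled 'unclassified'.
    """
    names = list(training)
    means = np.array([training[n].mean(axis=0) for n in names])   # (classes, bands)
    # Distance from every pixel to every class mean: shape (pixels, classes)
    dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    labels = [names[i] for i in dists.argmin(axis=1)]
    if max_distance is not None:
        too_far = dists.min(axis=1) > max_distance
        labels = ["unclassified" if far else lab for lab, far in zip(labels, too_far)]
    return labels

pixels = np.array([[21, 16, 5, 3],      # close to the water signature
                   [31, 21, 79, 84],    # close to the vegetation signature
                   [45, 42, 40, 40]],   # an ambiguous, mixed pixel
                  dtype=float)

print(minimum_distance_classify(pixels, training))
print(minimum_distance_classify(pixels, training, max_distance=20.0))
```

With no cutoff, the mixed pixel is forced into its nearest class; with a cutoff, it is left unclassified. Neither outcome is dictated by the data alone, which is the point made above about subjectivity.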


An example of a remote-sensing-assisted predictive approach based on cluster
analysis is a study of the archaeology of the Bisti-Star Lake region in northwestern
New Mexico, which was done in anticipation of large-scale coal mining (Kemrer
1982). The goal of this project was “to assess archaeological variability on lands
[designated] for potential competitive lease coal development” (Kemrer 1982:2).
This assessment was based on a sample of the total area and involved the implicit
construction of a “predictive model” (Kemrer 1982:2). Archaeological distribution
data were derived from six previous surveys that had been undertaken in the
immediate project area.

Remote sensing was used to generate independent environmental data against 
which to compare the Bisti-Star Lake archaeological sample. Two basic assump- 
tions of this project were that “site locational patterning is strongly related to the 
location of critical environmental resources [and that] it is likely that site
frequencies and environmental resources are directly related” (Baker and Sessions 1982:63).
The critical environmental variables that Baker and Sessions decided to measure
were soil associations and the presence of washes, which they concluded had not
changed appreciably since prehistoric times and which directly affect many other
variables that might have changed, such as vegetation and the distribution of
animal resources. Landsat MSS data and digital analysis methods were chosen
because of the size of the study area, the replicability of digital numerical methods
as compared to visual interpretations, and the ease of statistical comparison of
numerical output values with archaeological site densities. An October 1977 Landsat
MSS scene was chosen, and Soil Conservation Service soil mapping units,
superimposed on aerial photographs, were used as training samples in a discriminant
function analysis performed at the University of New Mexico's Technology
Applications Center.


The discriminant analysis, using a maximum likelihood classifier,
distinguished eight soil classes, which “was considered adequate for predictive modeling
purposes” (Baker and Sessions 1982:66). Based on methods developed during a
previous predictive study in New Mexico (Baker and Sessions 1979), a 2 by 2 km grid
was imposed on the study area, and archaeological and independent environmental
variables were compared within the cells of this grid. Archaeological site density was
correlated with four different, and perhaps overly complex, sets of remote-sensing-
derived variables, which they describe as follows:


1. The eight variables (seven soils associations, plus the category “washes”) output by
the digital image analysis;


2. A second set of eight variables based on the proportion of pixels per grid unit
classified into each class;


3. A set of 28 variables that represent all unique two-way interactions between the
environmental classes (classes 12 through 78) with values derived by multiplying the
number of pixels classified into each member of each two-class set within each grid unit;
and


4. A fourth set also containing 28 variables representing all unique two-way
interactions between the eight environmental classes, where the proportional number of
pixels classified into each member of each two-class set within each grid unit was
multiplied to derive values [Baker and Sessions 1982:64-69].


The stated rationale for developing these four rather complex sets of environmental
variables was that it was not known whether cultural variables (i.e., site densities)
would vary with absolute or proportional classified pixel frequencies, and that most
squares contained more than one environmental class.
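
The construction of the four variable sets can be sketched as follows. This is an illustrative reconstruction from the description quoted above, not Baker and Sessions's code; the function name `build_variable_sets` and the pixel counts for the example grid unit are hypothetical.

```python
from itertools import combinations
import numpy as np

N_CLASSES = 8  # seven soil associations plus "washes"

def build_variable_sets(pixel_counts):
    """Build the four variable sets for one grid unit from its eight class pixel counts."""
    counts = np.asarray(pixel_counts, dtype=float)
    if counts.shape != (N_CLASSES,):
        raise ValueError("expected one pixel count per environmental class")
    total = counts.sum()
    proportions = counts / total if total > 0 else np.zeros(N_CLASSES)
    pairs = list(combinations(range(N_CLASSES), 2))   # the 28 unique class pairs
    count_products = np.array([counts[i] * counts[j] for i, j in pairs])
    prop_products = np.array([proportions[i] * proportions[j] for i, j in pairs])
    return counts, proportions, count_products, prop_products

# Hypothetical grid unit with invented pixel counts per class
sets = build_variable_sets([400, 250, 0, 100, 50, 0, 150, 50])
n_variables = sum(len(s) for s in sets)
print(n_variables)   # 8 + 8 + 28 + 28
```

Every one of the 72 values is computed from the same eight counts, so the sets add no new measurements of the environment; they only re-express the original classification.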


The formula used for modeling site-component densities was a linear equation
in which observed site densities were taken to be the direct result of summing a set
of weighted independent environmental variables (Baker and Sessions 1982:84).
Weights and constants were determined through a series of backward stepwise
regressions; separate environmental variables were chosen from the 72 variables
that best correlated with the occurrence of eight temporal cultural classes of sites.
Regressions were done using squares that contained more than 20 percent total
survey coverage, those with more than 40 percent coverage, and those with more
than 60 percent coverage. The regression with the squares containing more than 60
percent coverage exhibited the least error, with R² values (“explained variance”;
Baker and Sessions 1982:87) ranging from 52 to 86 percent for each best-fit variable.
This preliminary model, which Baker and Sessions term Model I, was used as the
basis for making predictions about site-type densities for 813 2 by 2 km grid units,
and 15 of these units were then surveyed as a test of the prediction.
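
Written out, a linear model of the kind described would have roughly the following form. The notation is assumed here for illustration and is not taken from Baker and Sessions: \hat{D}_{k} is the modeled site density in grid unit k, b_{0} is the constant, x_{ik} is the value of environmental variable i in grid unit k, w_{i} is its weight, and m is the number of variables retained by the stepwise procedure. A separate equation of this form would apply to each temporal class of sites.

```latex
\hat{D}_{k} = b_{0} + \sum_{i=1}^{m} w_{i}\, x_{ik}
```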


Based on these results, another regression model was then generated in an
effort to project site densities more accurately. This model showed smaller average
error than did Model I, with R² values between 52 and 68 percent. Kemrer (1982:98)
notes that there are high “correspondences in variables selected between Models I
and II,” meaning that in general those variables that correlate positively with the
occurrence of archaeological sites in one model do so in the other as well, a pattern
that holds for a large number of the 72 variables inspected. “Therefore,” he
concludes, “it is highly likely that the environmental variables are sensitive
indicators of site frequency variations.”


I would suggest that this correspondence might, instead, be the result of the
variables all having been artificially constructed from the original eight remote-
sensing-derived soil and wash classes. Such variables cannot be independent, and if
patterning exists in the original eight variables then it will also be found in a large
number of the 72 derived variables.
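
The point about non-independence can be illustrated with a small simulation. Everything in this sketch is invented: two class values are drawn independently at random for 2000 imaginary grid units (a simplifying assumption, since real class proportions within a grid unit are constrained to sum to one), and a site density is made to depend on only one of them. The derived product variable nevertheless correlates with site density.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units = 2000

# Two hypothetical, independently generated class values per grid unit
class_a = rng.uniform(0.0, 1.0, n_units)
class_b = rng.uniform(0.0, 1.0, n_units)

# A derived "interaction" variable of the kind used in the third and fourth sets
interaction_ab = class_a * class_b

# Invented site density that depends only on class_a, plus random noise
site_density = 2.0 * class_a + rng.normal(0.0, 0.5, n_units)

r_original = np.corrcoef(class_a, site_density)[0, 1]
r_derived = np.corrcoef(interaction_ab, site_density)[0, 1]
r_between = np.corrcoef(class_a, interaction_ab)[0, 1]

print(f"class_a vs density:      r = {r_original:.2f}")
print(f"interaction vs density:  r = {r_derived:.2f}")
print(f"class_a vs interaction:  r = {r_between:.2f}")
```

Under these assumptions the product variable inherits a sizable share of the correlation simply because it is built from class_a, even though class_b has no relationship to site density at all.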


Another remote sensing experiment based on the assumption that
environmental factors are significant predictors of site locations was conducted in southern
Colorado by the University of Utah's Archeological Center in order to assess the
prehistoric and historical archaeological materials along a proposed railroad route
(Holmer 1982). In this study, “raw” pixel data digitized from a visual Landsat image
were correlated with the presence or absence of previously discovered
archaeological sites in parts of the study area that had been surveyed, and predictions were
then made about the probability of occurrence of sites in those areas not previously
surveyed. First, each 128 by 128 pixel portion of a Landsat visual image was digitized
or resampled into 500 by 500 m pixels, 22,400 of which were required to cover the
entire study area. These pixels were not subjected to a cluster analysis, but rather
their spectral characteristics were compared directly with site presence vs absence
through a discriminant analysis. The desired result was not determination of group
membership per se, but rather determination of the probability of group
membership within the group that contained sites; in this way “sensitivity zones [were]
defined by ranges of probability of site presence” (Holmer 1982:37-38). A very small
number of “site present” pixels (nine with historical sites and 119 with prehistoric
sites) were used to define the dependent cultural variable; eventually the
historical site-present cells were dropped, and only the prehistoric cells were used in
discriminant analysis. These cells constituted only 0.53 percent of the study area.


Discriminant analysis compared site presence with three spectral bands and
with the ratios between the red and blue Landsat bands. It was found that site
presence vs site absence could be best distinguished on the basis of data from the
red filter band (Holmer 1982:42). The same data were then compared using logistic
regression analysis, and under this procedure the no-filter data (i.e., simple
black-and-white density values within each grid cell) were the “best predictor” (Holmer
1982:44). Based on the results of the logistic regression, the total study area was
divided into three groups of pixels: those with a greater than 0.275 probability of
having sites, those with a probability falling between 0.275 and 0.100, and those with
a site probability below 0.100. These three zones were mapped, and the lowest
probability zone was classified as the most preferable area for development (Figure
9.6).
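
The final zoning step can be sketched as a simple thresholding of predicted probabilities. The two cutoffs are those reported above; the probability values, the zone labels, and the function name `sensitivity_zone` are invented for illustration, and the treatment of values falling exactly on a cutoff is an assumption, since the text does not say how such cases were handled.

```python
import numpy as np

# Cutoffs reported for the logistic regression results
LOW_CUTOFF, HIGH_CUTOFF = 0.100, 0.275

def sensitivity_zone(probabilities):
    """Map site-presence probabilities to three zones.

    Above 0.275 -> 'high'; below 0.100 -> 'low'; everything else -> 'medium'.
    Values exactly equal to a cutoff are placed in 'medium' (an assumption).
    """
    p = np.asarray(probabilities, dtype=float)
    return np.where(p > HIGH_CUTOFF, "high", np.where(p < LOW_CUTOFF, "low", "medium"))

probs = np.array([0.02, 0.100, 0.18, 0.275, 0.40])   # invented values
zones = sensitivity_zone(probs)
print(zones.tolist())
```

The cutoffs themselves are the analyst's choice; moving either one changes how much of the study area falls into the zone judged preferable for development.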

Holmer advances a number of conclusions based on this experiment. He
suggests that the pixel size used in this study, 500 m by 500 m, was excessive and
that more accurate results would be gained by using considerably smaller pixels.
Use of already digitized Landsat MSS data, he notes, would have been preferable
but could not be done given the economic constraints of this project. He concludes
that logistic analysis is an ideal analytical tool for studies of this sort because it
permits the researcher to incorporate variables of different levels (categorical and
continuous) into the analysis. Finally, he points out that, although a
nonprobabilistic archaeological sample of prior surveys was the basis of this experiment, a
probabilistic sample would be more appropriate for future studies.


Another remote-sensing-aided predictive study in the western United States
compared archaeological survey data from a 2.1 percent transect survey within the
Naval Weapons Center at China Lake, California, with variables derived from
resampled, 100 by 100 m pixel Landsat MSS data through a principal components
clustering analysis of four-band data (Elston et al. 1983). The “major objective was
to develop and characterize signatures for each transect irrespective of site
content,” and thus to arrive at an “independent typology of transects against which we
can investigate the relationship between transect type and site occurrence” (Elston
et al. 1983:63). The derived transect typology was displayed as a dendrogram, and
the number of sites per surveyed transect were “superimposed on the distal nodes
of the dendrogram” (Elston et al. 1983:64). The success of this projection was tested
by arbitrarily selecting 45 more transects, classifying them according to their place
on the dendrogram through additional Landsat-based cluster analysis, and
surveying them to determine how faithfully the projection was borne out.


Figure 9.6. A site-occurrence probability map of a Colorado study area compiled for
management purposes through digital analyses of Landsat MSS data (after Holmer 1982:Fig. 4.5.1). Darker
shading indicates areas with high probability of site occurrence; lighter shading, areas with medium
probability; no shading, areas with low probability.


Elston et al. found that their success rate for correctly characterizing the
probability of site occurrence for transects was 86 percent. They suggest that the
lower success rates of 58-70 percent achieved by Holmer (1982) and Baker and
Sessions (1982) were a result of using a two-group (site-present vs site-absent)
solution, when in reality not all “likely” areas would have been used in the past in
sparsely occupied regions. Sites would also, according to Elston et al., be likely to
occur or to not occur in more than one type of environmental setting. They
characterize their approach as “natural” (Elston et al. 1983:66), not a “cookbook
application of discriminant functions” like previous projects. The final results of
their analyses were mapped in three transect classes: those with probabilities for
site occurrence of less than 0.22, those with probabilities from 0.22 to 0.62, and those
with probabilities from 0.62 to 0.67 (Figure 9.7), the last of which they term site likely.
The narrow probability range represented by the group of site-likely transects is
interesting and seems to indicate that there were a significant number of transects
within this taxon.


Another group of predictive archaeological experiments that made use of
remote sensing for the measurement of environmental variables took place not in
the arid West, but instead in heavily vegetated Delaware. Based on Wells's (1981)
proposed model of the correlation between site locations and certain landform
features (especially sand ridges), these predictive attempts have encompassed at
least three separate archaeological studies (Wells et al. 1981; Custer et al. 1983;
Custer et al. 1984). In each study, environmental variables included distance to
water, geomorphological landform setting, soil type, gradient, and convexity of the
landscape (the two latter variables presumably based on topographic and not
Landsat data). These variables were measured within a 3500 m radius of each cell
(Wells et al. 1981), and a training set of such measurements was used as input to a
logistic regression comparing site occurrence with each variate. The study
reported in Wells et al. (1981) resulted in the compilation of a general site
occurrence probability map, which was then tested by further survey. This research
indicated that the relative contribution of each variable to explaining site
occurrence was as follows:


1. distance to minor stream, 50 percent;
2. distance to major stream, 42 percent;
3. distance to openland soils, 51 percent;
4. gradient, 51 percent;
5. convexity, 67 percent; and
6. distance to present marsh, 12 percent.


The low contribution of the last variable was explained by noting that most present 
marshes have been drained historically. 


A second study (Custer et al. 1983) compared archaeological survey findings in 
the St. Johns and Murderkill drainages in Kent County, Delaware (Custer and 
Galasso 1983), on a period-by-period basis with Landsat-generated environmental
variables, again using a logistic regression model and the same variables used in the 
previous study. Contour maps showing areas with less than 0.5, 0.5-0.75, and 
greater than 0.75 probabilities of containing sites were generated (Figure 9.8). 
During a second-stage test survey, 37 percent of the inspected areas that had been 
predicted to have probabilities in the 0.5-0.75 range contained sites, as did 49 
percent of the surveyed areas with predicted probabilities of 0.75 or greater. It is not
clear whether areas predicted to have less than a probability of 0.5 were tested. 


Another, more comprehensive test of the Delaware models has only recently
been reported. This study took place in New Castle and Kent counties as part of
planning for a proposed highway corridor (Custer et al. 1984). Detailed explanatory
site location models (i.e., theoretical formulations describing assumed past
subsistence and mobility organization) were set forth for each temporal period prior to
operationalization of the cultural and environmental variables. Environmental
variables were then devised and measured using the University of Delaware's
ERDAS 400 digital image analysis system. The authors then used their settlement
pattern model to predict the distance to each of these landscape features from each
site type. Using Wells’s (1981) logistic regression method, Custer et al. produced
contoured probability maps that again showed three probability zones of less than
0.5, 0.5-0.75, and greater than 0.75. These maps were compiled at 1:24,000 scale on 10
USGS topographic quads, and they are currently being used by the Delaware
Department of Transportation as planning aids.


Although Landsat digital MSS data are the most likely source for the remote-
sensing-aided classification and measurement of environmental variables, there
may be other choices in the near future. A recent experiment in assessing cultural
resources in Bandelier National Monument in central New Mexico (Figure 9.9)
made use of simulated SPOT digital data (Inglis et al. 1984). SPOT is a satellite that
is soon to be launched by the Centre National d’Etudes Spatiales (CNES) in France;
an airborne multispectral scanning device was flown by the CNES over selected
targets in the United States so that scientists could experiment with the three-band
SPOT data prior to launch. The Bandelier data were acquired on June 19, 1983, at a
resolution of 20 m, more than twice the resolution of Landsat MSS data, and were
analyzed using NASA ELAS software on a VAX 11/750 system at the University of
New Mexico’s Technology Applications Center.














Figure 9.9. The location of an experiment in projecting archaeological site occurrence using
simulated SPOT data in Bandelier National Monument, north-central New Mexico (after Inglis et al.
1984:Fig. 2; scale = 1:24,000; known site locations shown as open squares). Data from the French SPOT
satellite will derive from a multispectral scanner with considerably higher resolution than that
provided by Landsat. Although the data will be more expensive to acquire, they may be more
cost-effective than Landsat data for cultural resource management.


Remote-Sensing-Aided Archaeological Predictions:
Some Comparisons and Comments


There are, of course, differences among the numerous studies in archaeological
prediction summarized above. Some employed visual interpretation, while others
were based upon computer analysis of digital data; some were approached as ways of
designing samples using informed environmental stratification, while others were
explicitly directed toward the “prediction” of archaeological site locations,
occurrence vs nonoccurrence, or densities. The mathematical models used to compare
dependent (cultural) variables with independent (environmental) variables vary as
much in these remote-sensing-based approaches as they do in other types of
“predictive modeling” that are not based on remote sensing.


These studies are basically the same in one sense, however. None of these attempts at
prediction really constitutes prediction in the explanatory sense of the term advanced in
Chapter 4. Each projects empirically from the known occurrence of archaeological sites to the
probable occurrence of similar places in areas that have not yet been surveyed. This is a
reflexive exercise, and it is somewhat unsatisfying in that there is no assurance that any such
projection will be successful until it is tested, or that the next projection from the same data
will be similarly successful or unsuccessful when tested. This is because, regardless of whether
they incorporate a modern and useful technology like remote sensing, such nonexplanatory
exercises do not focus on the systemic level of the explanation of past human organization. The
next section of this chapter will set the stage for a discussion of some of the ways in which
remote sensing might be used to produce more productive, explanatory models.


In concluding this section I would reiterate my caution that the use of Landsat and other
remote sensor data should be carefully considered with regard to the limitations of this
technology. Remote sensor data exist in the present and are no more “reflections of the past”
than are contemporary archaeological data. Such landscape characteristics as the location of
water and other factors change through time, and there is no assurance that the reasons why
activities were located where they were in the past will have any sort of transparent
relationship to what we see on aerial photos or space images. In the final portions of this
chapter some possibly more realistic ways in which contemporary remote sensing data and
contemporary archaeological data can be brought to bear upon one another will be explored.


TABLE 9.2


A 25 by 25 pixel (picture element) matrix of simulated SPOT data over part of the Bandelier
National Monument study area. Known archaeological sites are shown as three-digit numbers in
boldface. Known site locations were correlated with zones classified using cluster analysis of
SPOT-simulated data (the single-digit numbers), and it was found that a majority of sites
occurred in only a few zones.


POTENTIAL APPLICATIONS OF REMOTE SENSING WITHIN 
THE EXPLANATORY FRAMEWORK OF ARCHAEOLOGICAL 
MODELING AND PREDICTION 


So far in this chapter I have discussed remote sensing and what it is, what some of its
general limitations in archaeology might be, and some of the ways in which archaeologists have
applied remote sensing methods and data to experiments in predicting certain aspects of the
archaeological record. Although remote sensing has been used in a number of different ways in
these archaeological experiments, their general method is uniformly one of empirical, inductive
“prediction” as diagrammed in Figure 4.1 of Chapter 4. This exercise generalizes from known
distributions of archaeological sites or materials (known on the basis of prior surveys or the
compilation of extant site forms) to a “prediction” of what additional sites or materials will
be discovered in the future in areas not yet surveyed. This is accomplished through the
tabulation or correlation of the differential occurrence of archaeological sites with respect to
differential distributions of environmental characteristics that are assumed to have been
important to decisions about where sites would be placed. As discussed in Chapter 4, this is the
method used in most of the “predictive modeling” efforts described in the archaeological
literature or in management reports today; the limitations of and problems with this method are
also explored at length in that chapter.
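
The tabulation step described above can be illustrated with a minimal sketch. It is not drawn from any of the studies cited here; the function name, the small zone grid, and the site positions are all hypothetical. Each cell of the grid carries an environmental zone number, the known sites are counted by zone, and each zone's share of the sites is compared with its share of the surveyed area.

```python
from collections import Counter


def zone_site_density(zone_grid, site_cells):
    """Compare each zone's share of known sites with its share of area.

    zone_grid  : list of rows, each cell holding a zone label (hypothetical data)
    site_cells : list of (row, col) positions of known sites
    """
    area = Counter(zone for row in zone_grid for zone in row)
    sites = Counter(zone_grid[r][c] for r, c in site_cells)
    total_cells = sum(area.values())
    total_sites = len(site_cells)

    table = {}
    for zone, n_cells in area.items():
        area_share = n_cells / total_cells
        site_share = sites[zone] / total_sites if total_sites else 0.0
        table[zone] = {
            "area_share": area_share,
            "site_share": site_share,
            # ratio > 1: more sites than zone area alone would suggest
            "ratio": site_share / area_share,
        }
    return table


# Hypothetical 4 x 4 classified image and four known site locations.
ZONES = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 2, 2],
    [3, 3, 3, 3],
]
SITES = [(0, 2), (1, 3), (2, 2), (3, 0)]
result = zone_site_density(ZONES, SITES)
```

A ratio well above 1 marks a zone that would be projected as "high probability" under this approach. The sketch records only that the association exists; it says nothing about why sites fall where they do, which is the limitation at issue in this chapter.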


Chapter 4 also describes another way of thinking about modeling and prediction: as integral
aspects of the process of archaeological explanation. Referring again to Figure 4.1, the
interpretations we make concerning the archaeological record (that is, the meaning we assign to
the remains that we encounter) are separated from the actual physical nature of the
archaeological record by many levels of phenomena.





It is the physical archaeological record and its distribution that managers are interested
in, for this is what they must manage. Meaning is given to the archaeological record, however,
only through explanation, and meaning is essential to predictive or projective statements about
the physical archaeological record for two primary reasons. The first is that in order to
predict the locations of archaeological sites successfully we must know not only what
“noncultural variables” they are correlated with, but also why. The answers to the question
“why?” must be posed in terms of systemic human organization, because systemic organization is
the way that people differentially locate themselves and their activities on a landscape. If we
do not understand the specific mechanisms of site placement, then there is no assurance that any
prediction can be extended from a known area to unknown areas, even if these areas are
immediately adjacent to one another. Such mechanisms can be known only through explanatory
modeling, and through use of predictions to test these models.


In Chapter 4 Kohler and I have suggested that the best independent variables for site
location models are ecosystem variables. Before models that incorporate ecosystemic variables
can be formulated, however, the things that intervene in the real world between the physical
archaeological record and past human organization must be addressed or “filtered out.” These are
the factors listed in the “processes” column of Figure 4.1: discard behavior, depositional and
postdepositional processes, and the methods that archaeologists use to discover, measure, and
analyze the portions of the archaeological record that we find.


The other reason that the meaning given to the archaeological record (explanation) is
all-important in managing this record is easier to state but ultimately more difficult to
define. As the volume editors have pointed out in Chapter 1, the legal and, I like to think,
moral reasons for even worrying about managing cultural resources are based on the significance
of those resources in terms of research potential. Cultural resources are important because, by
using them, archaeologists may be able to say something worthwhile about the operation and
organization of human systems and their components, past and present. The management of
significant archaeological resources has been mandated, and significance is based on meaning
given to cultural resources through the explanatory framework of archaeological science.


I can suggest two ways in which remote sensing has the potential for moving archaeological
prediction away from simple empirical generalization and toward more explanatory goals. It
should be understood that remote sensing, while it can play a part in this reorientation of
“predictive modeling,” is not the solution in itself. The real solution lies in the ability of
archaeologists to change the ways in which they think about doing archaeology; particularly, we
must discard the idea that archaeological explanation is or can ever be easy. Remote sensing can
only play a part in shifting archaeological thinking, but this part may be indispensable because
of the unique and inclusive sorts of data that remote sensing can provide. Remote sensing can
provide two specific and immediate classes of data that archaeologists need: data pertinent to
depositional and postdepositional processes, and data through which ecosystemic, rather than
simply environmental, variables might be measured. A few experiments in measuring and using such
data are reported below, along with suggestions concerning possible future directions.


Another area in which remote sensing can aid in the investigation and explanation of the
organization of past human systems is through applications to ethnography and
ethnoarchaeology. As emphasized previously in this chapter, remote sensor data are contemporary,
and as such they might best be applied to understanding the relationships between ongoing
hunter-gatherer and primitive agricultural systems. These relationships are one of the most
exciting data sources for archaeologists attempting to understand the operation of past systems
because they have the potential for suggesting some of the alternative ways that people adapt to
differing conditions through time and across space. Some of the ways that such information can
be brought to bear on the question of predictive modeling are suggested by Ebert and Lyons
(1983), Kruckman (1972), Parrington (1983), and Vogt (1974).





Remote Sensing and the Measurement of Depositional 
and Postdepositional Processes 


The materials that people use leave the cultural context and enter the archaeological
context when they are discarded; at some point after they are dropped on or intentionally buried
under the surface of the earth, they come under the influence of depositional processes and are
incorporated in sediments and soils. Deposition most often occurs in the context of
aggradational processes that bury cultural materials, although there are situations in which
cultural materials remain on the surface of the ground. Some depositional processes are
cultural, consisting of burial by human activity, but these are less common than natural
depositional events.


Materials buried in a definable layer or “level” are often assumed to be the results of a
single occupational episode (Conkey 1980), but this is not necessarily always the case. The
nature of the deposited archaeological record is controlled by the periodicity of occupation or
use of a place and the relationship between this periodicity and the periodicity of depositional
processes acting on cultural materials. Artifacts that are dropped only sporadically might be
covered by sediments left by depositional processes that occur more often than episodes of
dropping, while artifacts that are lost or abandoned relatively continuously will often be
subjected to depositional processes only after several episodes of site occupation have taken
place. In the latter case, the apparent “levels” will be the result of more than one episode of
site use. For instance, if a site is occupied or is the locus of activity several times between
successive rainy seasons, more than one episode of activity may be represented in each
depositional level. This poses problems for the archaeologist who is attempting to sort out the
results of periodic human behavior in that “demonstrably associated things may never have
occurred together as an organized body of material during any given occupation” (Binford
1982:17-18).
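
The relationship between the two periodicities can be made concrete with a small sketch. This is an illustration only, under the simplifying assumption that each depositional event seals everything dropped since the previous one; the function name and the dates are hypothetical.

```python
from bisect import bisect_right


def episodes_per_level(occupation_times, deposition_times):
    """Count the occupation episodes sealed into each depositional level.

    Assumption (for illustration): every depositional event buries all
    material dropped since the previous event. Times are in arbitrary units.
    Returns (counts per level, number of episodes not yet buried).
    """
    deposition_times = sorted(deposition_times)
    counts = [0] * len(deposition_times)
    unburied = 0
    for t in occupation_times:
        # index of the first depositional event later than this occupation
        i = bisect_right(deposition_times, t)
        if i < len(deposition_times):
            counts[i] += 1
        else:
            unburied += 1  # still exposed on the surface
    return counts, unburied


# Hypothetical example: deposition at times 10, 20, 30; five occupations.
levels, exposed = episodes_per_level([1, 4, 8, 15, 31], [10, 20, 30])
```

In this example the first "level" contains three separate episodes of use, the second contains one, and the third is culturally sterile, which is the situation in which apparent levels cannot be read as single occupations.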


Once cultural materials are deposited and become part of the archaeological record, they are
acted upon by another set of processes that can be thought of as postdepositional. Most
processes that disturb or act upon the surface or subsurface of the earth also affect
archaeological deposits. Such biological processes as faunalturbation and floralturbation (Wood
and Johnson 1978) modify deposited materials, as do a host of other mechanical and chemical
events. Foley (1981) presents a taxonomy of natural processes responsible for the burial,
movement, destruction, and modification of archaeological deposits (reproduced here as Figure
9.10). Discarded materials enter the archaeological record through burial by cultural or natural
agencies; once assemblages are buried they may remain in place, or they may be moved through
stream action, sediment movement, faulting, or mass wasting. At the same time, certain materials
may or may not be altered by physical and chemical agencies while in or on the ground. Foley
(1981) also identifies what he calls small-scale oscillation processes that act on the discarded
archaeological record, including water or wind action, animal burrowing, root action, and human
disturbances.


Natural postdepositional processes can alter or destroy archaeological materials, but they
also play a role that is vitally important to the archaeologist: they expose these materials,
making them visible and thus available for study. Most archaeology carried out in the United
States today is undertaken in the context of cultural resource management assessments, which
entail systematic survey of the surface of the earth in areas that are to be disturbed by
reservoir construction, strip mining, or other engineering and resource extraction activities.
Buried archaeological materials are not found during such surveys; only those cultural materials
that are exposed but not totally destroyed are found and serve as the basis of archaeological
study and interpretation. When subsurface testing is incorporated into surveys, it can expose
but a tiny part of buried remains. It is only during the short and relatively uncommon period
between the exposure of deposited materials and their dispersal or destruction that these
materials are available to archaeologists for study. For this reason, it is critical that
archaeologists carefully consider the nature and actions of the processes that make their basic
data available to them.


There is no easy way for the archaeologist to observe, characterize, measure, and predict
depositional and postdepositional processes. Both deposition and most postdepositional
alteration took place in the past, so these processes cannot be observed directly. In addition,
the distribution of these processes probably varies across the landscape. Analogs might be found
in contemporary surface processes, however, which means that the forces that have acted on
archaeological materials (and possibly also their rates or the magnitude of their effects on the
archaeological record) are potentially predictable. If such processes can be predicted, then at
least some aspects of the depositional and postdepositional “formation processes” (Schiffer
1983:675) intervening between the materials discarded by past peoples and the archaeological
record that we actually see today can be taken into account. And such factors must be accounted
for before we can attempt to predict the locations in which archaeological materials can be
expected.


To most archaeologists it seems reasonable to turn to geologists and geomorphologists for
the details of such natural processes and of their differential occurrence and rates, but
usually these disciplines cannot provide the necessary level of detail. In fact, when an
archaeologist and a geomorphologist are introduced, the latter will almost always initiate
probing questions about whether archaeology can supply concrete dates for recent natural surface
events. This interest on the part of geomorphologists has probably been the major impetus behind
the development of the subfield of geoarchaeology (Butzer 1977; Gladfelter 1981), but it is just
the reverse of what we want to hear. Most geomorphological studies are conducted in
circumscribed places under specific conditions and are even more inductively based than
archaeology. Archaeologists need to be able to arrive at generalizations about the places in
which different surface processes act to deposit and disarrange or preserve archaeological
materials across relatively large study areas. Fortunately, remote sensor data, with their wide
areal coverage, may help to supply this information.


One such remote sensing study was undertaken in an attempt to define the extent of different
surface deposits and their archaeological correlates in Chaco Canyon in northwestern New Mexico
(Ebert and Gutierrez 1981). Chaco Culture National Historical Park has been extensively surveyed
for at least 50 years owing to the spectacular and concentrated nature of its archaeology, and a
data base of more than 1200 archaeological sites was available at the National Park Service's
Division of Cultural Research for comparison with remote-sensing-aided mapping of surface
deposits there. Previous geological and geomorphological studies had examined alluvial deposits
and hillslope processes and their rates, and these data provided a basis for photointerpretation
and mapping of geomorphic surface units.


Geomorphic units were interpreted by Ebert and Gutierrez (1981) using 1:6000 scale aerial
color transparency photos viewed with a Bausch and Lomb variable-power stereoscope; these units
were transferred to 1:12,000 black-and-white orthophotoquads and from those to a 1:12,000 scale
base map, which also bore the locations of archaeological sites in the data base. Two
descriptions, landform and photointerpretive, were generated for each geomorphic unit defined,
based on tone, color, texture, vegetation associations, and landform associations (Figure 9.11
and Table 9.3).


Correlations between site locations and geomorphic surface units (summarized in Table 9.4)
were of interest relative to interpretations of the differences between locations where
different types of sites were found by survey archaeologists. Archaic sites, usually consisting
of small scatters of stone flakes, were found on the oldest visible surfaces in Chaco Canyon.
Similarly, Basketmaker sites were found primarily on stable and inactive surfaces, as were the
Pueblo I, II, and III sites. Later Pueblo sites were found relatively more often on less stable
surfaces, and the even more recent Navajo sites occur in high proportions on very active
surfaces where older materials would either be obscured or destroyed. The smallest sites (as
recorded in the NPS data base) are found in units with little or no alluvial or aeolian surface
veneer, while larger sites predominate in fine-grained, inactive Quaternary units where
sheetwash, uniform sedimentation, and relatively even aeolian deposition would cover smaller
occurrences but allow larger materials (masonry walls, for instance) to project above the
surface.
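
A cross-tabulation of this kind can be sketched as follows. The records, the unit labels ("stable," "transitional," "active"), and the function name are hypothetical placeholders rather than values from the Chaco data base, and the choice of row percentages is an assumption made for illustration, not a statement of how Table 9.4 was computed.

```python
from collections import defaultdict


def row_percentages(records):
    """Cross-tabulate sites by cultural affiliation and surface unit.

    records : iterable of (affiliation, surface_unit) pairs (hypothetical)
    Returns, for each affiliation, the percentage of its sites in each unit.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for affiliation, unit in records:
        counts[affiliation][unit] += 1

    table = {}
    for affiliation, by_unit in counts.items():
        total = sum(by_unit.values())
        table[affiliation] = {
            unit: 100.0 * n / total for unit, n in by_unit.items()
        }
    return table


# Hypothetical site records: (cultural affiliation, surface stability class).
RECORDS = [
    ("Archaic", "stable"), ("Archaic", "stable"),
    ("Pueblo", "stable"), ("Pueblo", "transitional"),
    ("Navajo", "active"), ("Navajo", "active"),
    ("Navajo", "active"), ("Navajo", "stable"),
]
crosstab = row_percentages(RECORDS)
```

Reading across a row shows where the surviving, visible sites of one period are concentrated. As the discussion above stresses, such a pattern may reflect the age and activity of the surfaces as much as past choices about where to live.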


TABLE 9.3.


Surface geomorphological units, their designations, their photointerpretive recognition
patterns, and their descriptions and summary surface dynamics as mapped using photointerpretive
techniques in the study of postdepositional processes on the archaeological record at Chaco
Culture National Historical Park in northwestern New Mexico


Designation: Quaternary/Tertiary pediment deposits (stable)
Landform/Photo Description: Undulating surface with poorly integrated surface drainages, dark in
tone. Vegetation sage and grass.
Stability/Dominant Process: Stable alluvial, colluvial, and aeolian deposits resting
unconformably on eroded Tertiary and Cretaceous deposits. Little runoff or sediment produced on
these highly permeable deposits.

Designation: Quaternary/Tertiary pediment deposits (transitional)
Landform/Photo Description: Tan sand texture, scattered vegetation and integrated surface
drainages. High drainage density, parallel to dendritic drainage pattern.
Stability/Dominant Process: Produces significant runoff and high sediment yields; occupies zones
between QT mesas and unvegetated badlands.

Designation: Cretaceous sandy bedrock (buried)
Landform/Photo Description: Light tan-white bedrock with discontinuous veneer of aeolian sand,
local sheetwash alluvium.
Stability/Dominant Process: Intermittent aggradation/erosion by sheetwash, aeolian processes.

Designation: Cretaceous sandy bedrock (exposed)
Landform/Photo Description: Light tan-white exposed bedrock with very sparse vegetation; fine
textured with joint patterns clearly visible. Cliff House and Picture Rocks formations.
Stability/Dominant Process: Flat surface with little or no cover; sheetwash and aeolian erosion.

Designation: Holocene gullies
Landform/Photo Description: Localized discontinuous drainage, 1-3 m deep, up to 300 m long.
Stability/Dominant Process: Unstable, rapidly eroding.

Designation: Holocene dam sedimentation
Landform/Photo Description: Well-vegetated, fan-shaped deposits behind dams or diversions.
Stability/Dominant Process: Rapidly aggrading, anastomosing channels; date from 1930s.

Designation: Holocene soil pipes
Landform/Photo Description: Arcuate depressions or collapsed soil pipes on terrace edges.
Stability/Dominant Process: Soil piping, mass movement, highly unstable, eroding rapidly.

Designation: Quaternary alluvial fan 1
Landform/Photo Description: Topographically raised, irregular-shaped deposits; vegetation
density slightly higher than Qaf2 or Qt1.
Stability/Dominant Process: May or may not contain active, incised channels.

Designation: Quaternary alluvial fan 2
Landform/Photo Description: Conical fan-shaped fill associated with major side canyons;
light-medium tone.
Stability/Dominant Process: Relatively stable surface, some graded to Qt1 surface. May contain
buried soils or humus-rich layers.

Designation: Quaternary alluvium 1
Landform/Photo Description: White, incised, meandering/straight channel in central valley with
vertical walls.
Stability/Dominant Process: Active alluvium, thickness 0-15 m. Erosion/aggradation dominant.

Designation: Quaternary alluvium 2
Landform/Photo Description: Tan-light gray alluvium in incised channels with steep banks,
cutting through alluvial fill in side [canyons].
Stability/Dominant Process: Active alluvium in major tributaries of Chaco Canyon, many
individual channels and cut/fill sequences.

Designation: Quaternary alluvium 3
Landform/Photo Description: Light-toned alluvium associated with surface traces of fossil
channels, high vegetation.
Stability/Dominant Process: Inactive alluvium, mostly reworked Qal1 material. Thickness 2-4 m.

Designation: Quaternary badlands
Landform/Photo Description: Banded gray to dark brown beds following topography, high drainage
density. Little vegetation.
Stability/Dominant Process: Relatively impermeable shales with interbedded sandstones; covered
by 0-0.7 m of weathered mantle. Easily eroded, active surface.

Designation: Quaternary fine-grained colluvium
Landform/Photo Description: Light brown, fine textured, irregular shaped deposits near extensive
shale and sandstone outcrops.
Stability/Dominant Process: Sheetwash material derived from valley sidewall sandstones and
gentler shale slopes at their base.

Designation: Quaternary dune sand
Landform/Photo Description: Light brown, linear, topographically high deposits associated with
Ke bedrock mesas and buttes. Bushes; no grasses; no established drainages.
Stability/Dominant Process: No integrated drainage development, little erosion. Dunes aligned
N 60-70° E where linear. Thickness 0-2 m.

Designation: Quaternary talus
Landform/Photo Description: Medium-tone bands along base of sandy bedrock cliffs. Large angular
blocks of sandstone talus on shale slope.
Stability/Dominant Process: Larger talus blocks stable, localized creep, sheetwash, and debris
flow deposits.

Designation: Quaternary terrace 1
Landform/Photo Description: Highest terrace incised by current Chaco Arroyo (Qal1). Large areas
of low relief within main canyon. Vegetation sparse.
Stability/Dominant Process: Oldest inactive terrace; interbedded with alluvial fan, sheetwash,
and colluvium from side canyons.

Designation: Quaternary terrace 2
Landform/Photo Description: Discontinuous, light brown, fine textured areas between Qt1 scarp
and active arroyo (Qal1).
Stability/Dominant Process: Youngest terrace or floodplain of present arroyo (Qal1) in some
areas. Stability varies, 1-3 m above channel.


TABLE 9.4.


Occurrence of known archaeological sites and materials at Chaco Culture National Historical
Park (grouped by cultural affiliation and site size) within geomorphic surface units mapped with
photointerpretive techniques

Hg Hy ke Ay Ret, Qalp Ral; Ral» al; ge g @4 g Qi, > & ly Q F
CULTURAL AFFILIATION
Archaic 0 0 03 05 GO 0 0 f 0 0 0 {) 0 0 0 0
Basketmaker II 0 0 03 03 0 0 0 0 0 0 05 C 0 te
Basketmaker III G 0 Ge 10 ie i 10 07 0 10 iw iS 5 19 io
Pueblo I 20 Q 06 4 13 22 20 $i li 27 32 0 2 17 » Iv U6
Pueblo II ow SO 12 27 0 22 30 6 % Pa.) »” 35 2a 4 24 3] 37


Another remote-sensing-based study, which built upon the Ebert and Gutierrez (1981) Chaco
Canyon experiment, was carried out in the Green River Basin of southwestern Wyoming (Wandsnider
and Ebert 1983). Fluvial, aeolian, and gravitational processes have altered the landscape there
in post-Pleistocene times, giving rise to what appears to be a varied and diverse region when it
is considered on a small scale. The Green River Basin is quite arid today and probably has been
for some time; in most places rainfall is less than 400-500 mm annually. Even on high plateaus
and slopes, vegetation is sparse, usually covering not more than 20 percent of the ground
surface, which makes a remote sensing approach to surface depositional, erosional, and
aggradational processes fairly straightforward.


The Green River Basin seems to have been inhabited at relatively low population levels since
the beginnings of North American settlement. Paleoindian, Archaic, Fremont, and Plains Indian
groups have left their remains there for at least 10,000 years; it appears that there may
actually have been little difference in the lifeways of these people over a long time span,
although the Fremont were at least partially agricultural while the others followed a hunting
and gathering way of life. The majority of archaeological sites found in the Green River Basin
are Archaic, a broad typological category encompassing virtually all materials dating from about
9000 BC to historical times, with assemblages consisting of stone tools and debris and
containing little or no pottery. Many “sites” found in the Green River Basin are hundreds of
meters long and wide, contain tens to hundreds of hearths, and have relatively sparse but even
distributions of lithic artifacts. These assemblages and features are very likely the result of
the reoccupation of these places over many thousands of years, coupled with depositional and
erosional processes encouraging the formation of superimposed assemblages or palimpsests.


The Green River Basin experiment coupled the mapping of natural surface 
processes with an on-the-ground archaeological survey carried out by the National 
Park Service Branch of Remote Sensing in 1983-1984. This experiment was directed
toward evaluation of the cultural resources on lands surrounding the Seedskadee 
National Wildlife Refuge along the Green River that are under the jurisdiction of 
the Bureau of Reclamation. The explicit goal was to incorporate remote sensor data 
into a predictive model of archaeological site locations and their characteristics. 





Before the zones of differential geomorphic surface processes affecting the archaeological
record in a 559,000 ha (1,380,700 acre) study area could be mapped, a data source was needed that
would provide a regional perspective while permitting discrimination of different sorts of areas
with resolution at culturally and archaeologically relevant scales. Remote sensor data,
particularly those derived from satellite-borne sensors, are ideal for this application,
particularly where little on-the-ground geomorphological mapping has taken place. The basic data
source used in geomorphological mapping of the project area was a 1:100,000 scale Landsat 3 color
composite visual product. Composed of an overlay of bands 4, 5, and 7 data from the Landsat
multispectral scanner, this image has a ground resolution of about 80 by 80 m and approximates a
color infrared view of the imaged scene. Color infrared accentuates vigorous vegetation,
permitting discrimination between areas of growing plant cover and bare earth; this capability
is particularly useful in defining differential surface processes.
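
The figures quoted above imply some simple arithmetic that is worth making explicit. The sketch below uses only the numbers given in this chapter (80 m Landsat MSS pixels, a 1:100,000 print, a 559,000 ha study area, and 20 m SPOT pixels); the function names are illustrative, and the results are rough orders of magnitude rather than sensor specifications.

```python
def ground_to_map_mm(ground_m, scale_denominator):
    """Size on the printed image (mm) of a ground distance (m) at 1:scale."""
    return ground_m * 1000.0 / scale_denominator


def pixels_covering(area_m2, pixel_m):
    """Approximate number of square pixels needed to cover an area."""
    return area_m2 / (pixel_m * pixel_m)


HECTARE_M2 = 10_000  # one hectare in square meters

# One 80 m Landsat MSS pixel on the 1:100,000 color composite print.
pixel_on_print_mm = ground_to_map_mm(80, 100_000)

# Number of 80 m pixels covering the 559,000 ha study area.
study_area_pixels = pixels_covering(559_000 * HECTARE_M2, 80)

# How many 20 m SPOT pixels fall within one 80 m Landsat MSS pixel.
spot_per_mss_pixel = pixels_covering(80 * 80, 20)
```

Each Landsat pixel is thus under a millimeter across on the print, and the whole study area is described by somewhat fewer than a million pixels, which gives a sense of the scale at which surface processes can be discriminated from these data.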


Mapping was initiated by overlaying a sheet of frosted mylar on the 1:100,000 scale Landsat
scene of the study area and placing these two registered sheets on a light table.
Black-and-white photo prints at a scale of 1:80,000 and arranged in a mosaic fashion were checked
against the Landsat image to define the boundaries of zones of different geomorphic processes on
the mylar overlay. Geology maps prepared by the State of Wyoming and Soil Conservation Service
provisional county soils maps for Sweetwater, Lincoln, and Uinta counties were also consulted
during the interpretation process.


Although the information available from the Landsat image, the aerial photographs, and the
maps was different, the three sources were found to be complementary. The resolution of aerial
photographs is many times greater than Landsat resolution, of course, and permits identification
of small-scale topographic patterning. For instance, individual sand dunes and interdunal flats
could be easily distinguished on the aerial photographs. Once areas characterized by dunes were
located on the aerial photographs, the same areas were checked on the Landsat image, and the
tonal and textural qualities of those areas were noted. By using the patterns identified in this
way, we were able to detect additional dune areas directly from the Landsat image, subject to
verification using the aerial photographs after such an interpretation was made. In some cases
the geological and soils maps were useful in checking and placement of boundaries, although
these maps were far more generalized than the geomorphological mapping done from the Landsat
data. Photointerpretation could have been performed using only the aerial photographs, but this
would have required the construction of a control network (see Ebert 1984) for about 100 prints,
a very difficult task. Landsat data are geometrically corrected; thus, these data are ideal for
environmental mapping such as that undertaken in the Seedskadee project area.


Fifteen of the larger geomorphological zones (Figure 9.12) were grouped, for 
purposes of discussion, under six general headings with assumed depositional and 
postdepositional sig.uficance: 


|. Terraces formed largely by fluvial processes. This class includes both 
presently active terraces and those formed in the more or less recent past — 
possibly as early as the Pleistocene In the most recently active of these areas, 
channel and overbank deposition dominate the depositional processes, while 
on earher terrace surfaces slopewash, sheetwash, and acolian processes are 
common. 


2. Playas and Flats consisting of relatively flat areas expenencing slow 
deposition of fine-grained sediments. Deposition in these areas is facilitated by 
either internal or external drainages. When dry, these areas are subject to 
aeolian deflation. 


3. Dunes, which in the study area occur not in extensive fields but rather 
interspersed throughout badlands, flats, or along the edges of intermittent 
watercourses where sand is plentiful. Some dunes also occur where mesatop 
scarps cause the wind to drop its sediment load. Presently active dune areas, 
which form the majority of the areas included in this category, are character- 
ized by connected crescentic or barchan dunes; at least two areas of earhier, 
relavively well stabilized parabolic dunes are also found in the study area. 
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Figure 9 i2 Ceomorphologn ai surtace units mnterpreted visually trom a Landsat color 


composite print and |:80,000 scale aerial photographs un and around the Scedskadec Natronal Wildlite 
Retuge on the Green River, southwestern Wyoming. Thus map «as compiled as part of a distributional 


archacologycal survey of the area ( W andsmder and t bert 1984). Results undwcate that much of what we 

4. Badlands consisting of highly eroded shales with a dense and reticulate
drainage. In most cases this geomorphological class is interspersed with flats,
dunes, and small remnants of earlier surface or “mesa” areas.


5. Mesatop areas, which are the more or less dissected remnants of earlier
Tertiary gravel and bedrock surfaces. Four mesatop areas were distinguished
on the basis of their landform and by the fact that at least some
vegetation cover (dominated by Basin big sagebrush [Artemisia tridentata
tridentata] and grasses) was distinguishable on the Landsat image of these
areas.


6. Agricultural Areas irrigated with water from the Fontenelle Reservoir or
the Green and Black’s Fork rivers, which are extensively modified and
probably need not be further considered by archaeologists, at least by
archaeologists searching for surface remains.


Archaeological data were collected in this study area through a nonsite or
distributional archaeological survey strategy (described at length in Chapter 4) to
test these formulations and are still being analyzed. One pertinent observation
made during the collection of the archaeological data was that the scale of surface
processes with apparent relevance to artifact distributions may be far smaller than
the scale of surface processes that can be discerned on Landsat MSS or small-scale
aerial photographs. More recently, surface geomorphological processes have been
reinterpreted using stereoscopic photointerpretation of 1:12,000 black-and-white
aerial photographs of the 500 by 500 m sample units within which field archaeological
recording took place. While the initial, small-scale photointerpretation was
directed toward understanding general postdepositional characteristics across the
study area, this second analysis will be applied directly to the task of filtering out
postdepositional processes affecting specific archaeological materials found during
survey. In order for this to be accomplished, it is clear that artifacts rather than sites
must be the unit of discovery and recording. See Chapter 4 for a discussion of the
advantages (and, I would suggest, the necessity) of a distributional archaeological
approach.


Remote Sensing and the Measurement and Meaning of Ecosystemic
Variables for Archaeological Modeling and Prediction 


In Chapter 4 of this volume, Kohler and I have suggested that one avenue by
which archaeologists might move beyond the empirical, inductive generalizations
that we currently refer to as “predictive modeling” is by attempting to use
ecosystemic rather than simply environmental or landscape characteristics as
independent variables. It is the organization of human systems that we must understand if
we are to explain the mechanisms behind mobility, the placement of activities in
space, and the locations of discarded archaeological evidence. It was pointed out
that at the systems level human organization responds not to the unique placement
of specific resources at a single time and place, but rather to the regional spatial and
temporal patterning of all resources, that is, to the organization of the ecosystem
as a whole.


There are abundant means for measuring simple environmental variables
(slope angle and aspect, distance to water sources, elevation, and the like), and this
is probably the major reason why these quantities are used as variables in most
contemporary predictive models. Measuring or even identifying ecosystem variables
is more difficult, and the first step in using such variables in modeling,
prediction, and explanation will consist of research into new measurement
techniques. Remote sensing is one source of such techniques that is increasingly
available to the archaeologist. An example of a remote-sensing-based approach to
the measurement of one possible ecosystem variable, environmental diversity,
will serve as an illustration of possible research directions.


Environmental diversity, as the term is used here, is a measure of spatial
heterogeneity in resources; even in a very general sense it is obvious that this variable should
have consequences for the organization of human subsistence behavior. In an
environment where many different resource species are distributed evenly, a
human group dependent on these resources should minimize energy expenditure
by being sedentary and territorial; if resources are clumped rather than evenly
distributed, then high mobility will be necessary in order to exploit the full range of
resources.


In order to examine the potential of this variable for explaining differences in
human mobility and resource procurement, Harpending and Davis (1977:276) have
suggested a “model” consisting of a one-dimensional environment along which the
occurrence of a variety of natural resources is measured and for which the
abundance of each resource is graphed as a continuous function. The complex
continuous function represented by each resource can be viewed as the sum of Fourier
components (a series of sine waves of different frequencies added together), and
the resulting power spectrum can be analyzed.
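
The decomposition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure Harpending and Davis used: the transect (100 cells of 1 km each), the synthetic resource with a 20 km period, and the function names `power_spectrum` and `dominant_cycles` are all hypothetical.

```python
import cmath
import math


def power_spectrum(abundance):
    """Power at each frequency k (cycles per transect) of a 1-D abundance series."""
    n = len(abundance)
    mean = sum(abundance) / n
    centered = [a - mean for a in abundance]  # remove the constant (zero-frequency) term
    spectrum = []
    for k in range(n // 2 + 1):
        # Discrete Fourier coefficient for k complete cycles along the transect
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def dominant_cycles(abundance):
    """Number of cycles per transect carrying the most power (k = 0 excluded)."""
    spectrum = power_spectrum(abundance)
    return max(range(1, len(spectrum)), key=lambda k: spectrum[k])


# Hypothetical transect: 100 cells of 1 km, one resource repeating every 20 km.
transect = [10 + 5 * math.sin(2 * math.pi * km / 20) for km in range(100)]
cycles = dominant_cycles(transect)
print(cycles, len(transect) / cycles)  # cycles per transect, spatial period in km
```

Comparing the spectra of several resources measured along the same transect would then show whether their dominant periods coincide and, with the phase of each coefficient, whether they are in or out of phase.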


Harpending and Davis initiate their model from the stance that hunter-gatherer
groups seek or desire maximum variety in their diet, an assumption that is
far from proven but one that is common in the Bushman literature and in fact in
most literature dealing with generalist hunter-gatherers. If this assumption is
correct, however, it is clear that people pursuing such an adaptation would seek
areas in which to live and gather foods that had the maximum possible variety of
food.


Harpending and Davis also hypothesize that the benefit that hunter-gatherers
derive from increasing the size of their range is greatest when resources are out of
phase (that is, they do not co-occur perfectly) with a cycle of redundancy of 1 km
to 100 km. When all resources occur together at discrete locations, the benefit from
increasing range size should be less. Maximum range size would be expected where
there are few resources and where those resources are maximally out of phase with
one another over distances of 1-100 km; minimum range size should occur where
resources show little spatial variation or where many resources co-occur.
Harpending and Davis also suggest some implications for group sizes: groups with
maximum range sizes and extremely high mobility in low-abundance, out-of-phase
resource environments should be relatively small with poorly defined local
boundaries (for instance, in the Kalahari Desert). In the minimum range-size category,
small groups would be expected with little spatial resource variation (e.g., in
tropical rainforests), while larger groups would occur when resource variances are in
phase (for example, on the northwest coast of North America).


A test of anthropological and archaeological implications of such expectations
would depend on the measurement of spatial variation in resource patterning over
large areas, something that is extremely difficult to do. Ecologists measure such
variation by counting and weighing types and numbers of plants, an expensive and
time-consuming process even in small test plots. In addition, there is the very real
danger in on-the-ground efforts of becoming “too close” to the data, of placing
emphasis on taxonomy and the specific properties of individual taxa as
“determinants,” to the detriment of a wider perspective. For both economy of effort and
maintenance of a regional perspective, remote sensing methods may be superior to
on-the-ground ecological measurements of environmental diversity.


Remote sensor imagery, particularly photographic or multispectral
representations of ground scenes, contains information on the reflectivity of different parts
of a scene covering a portion of the earth's surface. Reflectivity is determined by
ground cover, soil type, topography, and an amalgam of other natural factors, all of
which would correspond to a greater or lesser extent with the distribution of
vegetation. Since animal life is dependent upon the patterning of primary
producers, remote sensor data should convey information about faunal resource
distributions as well.


The limits of 10-100 km suggested by Harpending and Davis (1977) as a
relevant distance for the discussion of resource periodicities among human groups
cover a significantly larger span than do most aircraft platform images. For this
reason, Landsat or other satellite scanner data may be the ideal media for
experiments in the measurement of archaeologically relevant environmental diversity.
One objection often raised concerning Landsat MSS data is its low resolution, so a
consideration of the sufficiency of these data for spectral analysis of the sort
discussed above is perhaps in order.


As will be discussed later in this chapter, the periodicities of occurrence of
resources or of the landform characteristics that determine the distribution of
resources constitute one property of the environment that can be measured to
arrive at data that qualify as ecosystemic. For instance, the ecosystemic properties
of an area may be very different if there are five apple trees and five orange trees
than if there are 500 orange trees and 500 apple trees. A rule of thumb for the
measurement of periodicities from serial data, the Nyquist criterion (Gillespie
1980:149), holds that at least two samples per cycle of the highest spatial frequency
information to be obtained from an image are required. A Landsat MSS image
provides a ground coverage of approximately 185 by 185 km; to detect a 10 km
spatial period, then, (2 × 18.5)² or 1369 samples would have to be derived from the
frame. Landsat MSS imagery contains some 1.6 * 10° pixels per frame, nearly 8000
times as many potential samples as would be required for such sampling. Data
derived through aerial photography are even more detailed. Conventional aerial
photos contain about 4 * 10 pixels per frame, and high-resolution images have
several times that many pixels (Reeves 1975:1106).
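
The sample count quoted for the Nyquist criterion can be checked with a short calculation. The sketch below assumes a square frame sampled on a regular grid; the function name `min_samples` and its parameters are illustrative only.

```python
import math


def min_samples(frame_km, period_km, samples_per_cycle=2):
    """Minimum number of grid samples needed across a square frame so that a
    spatial period of `period_km` is sampled at least `samples_per_cycle`
    times per cycle along each axis (the Nyquist rule of thumb)."""
    per_axis = math.ceil(samples_per_cycle * frame_km / period_km)
    return per_axis ** 2


# A 185 by 185 km frame and a 10 km spatial period:
# 18.5 cycles per axis, 37 samples per axis, 37 * 37 samples in all.
print(min_samples(185, 10))
```

Shorter periods raise the requirement quickly: halving the period to 5 km quadruples the number of samples needed.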


An early remote sensing experiment carried out to assess the possibility of
measuring archaeologically relevant environmental diversity using aerial
photographs focused on the lower Chaco River drainage and surrounding badlands and
mesatop areas in northwestern New Mexico during a cultural resources survey of
coal mining lands (Reher 1977). An initial hypothesis advanced as part of the
explanation of Archaic site densities in the study area was that Archaic site densities
should increase as a function of increasing diversity in vegetation (Reher and Witter
1977:114). This hypothesis was based on the assumption that Archaic peoples
pursued a generalist subsistence strategy, relying on a wide variety of vegetal
resources throughout the year. This assumption may not be totally valid or
realistic, based on subsequent research (Hogan and Winter 1983; Moore and Winter
1980), but a discussion of the way in which diversity measures were obtained should
help to point the way for future efforts in this direction.


Two separate data sources were used to measure vegetation diversity:
on-the-ground botanical survey and the analysis of aerial photographs. The aerial
photographic measurements employed 1:6000 and 1:12,000 black-and-white and
color transparency aerial photos of the study area, which were analyzed using an
International Imaging Systems analog image analysis system. One of the capabilities
of this system is a graphic readout of density changes in the emulsion of a
photograph placed on a light table and viewed with a high-resolution video camera.
Such a graphic readout of densities of course corresponds to differences in
vegetation, topography (shadow), soils, and other proxies of environmental diversity.
Each photograph from the areal coverage of the study area was placed on the light
table in turn, and the density graph of a north-south line across its center was
examined. Peaks in this graph with an amplitude greater than an arbitrary cutoff
value were counted, thus providing a simple, efficiently derived measure of the
amount of variation in density across each photographic frame. The number of such
graph peaks counted was assigned as a “diversity index” to the area covered by
each photo frame (Ebert and Hitchcock 1977:212).
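
The peak-counting procedure lends itself to a brief digital sketch. The version below is an assumption-laden analogue of the analog readout, not the original method: it treats the density trace as a list of numbers, defines a peak as a local maximum, and measures amplitude as the raw density value compared against the cutoff. The function name and sample values are hypothetical.

```python
def count_peaks(trace, cutoff):
    """Count local maxima in a density trace whose value exceeds `cutoff`.

    A flat-topped peak (several equal values in a row) is counted once,
    at its first sample.
    """
    peaks = 0
    for i in range(1, len(trace) - 1):
        rises = trace[i] > trace[i - 1]          # climbing into this sample
        does_not_rise = trace[i] >= trace[i + 1]  # level or falling afterwards
        if rises and does_not_rise and trace[i] > cutoff:
            peaks += 1
    return peaks


# Hypothetical north-south density trace across the center of one photo frame.
trace = [0, 3, 0, 1, 0, 5, 5, 0, 2, 0]
diversity_index = count_peaks(trace, cutoff=2.5)
print(diversity_index)
```

The resulting count depends directly on the cutoff chosen, which is why the source describes that value as arbitrary; indices are comparable only between frames analyzed with the same cutoff and at the same photo scale.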


A vegetative diversity index was independently derived from analysis of plant
communities and associations measured on the ground; this index was found to
correspond closely with the remote-sensing-derived index. Correlation of both
indices with Archaic site location data derived through transect survey indicated
that Archaic site density was highest in areas lying immediately adjacent to high
vegetation or environmental diversity areas, but that the sites were not necessarily
within these areas themselves. A possible explanation is that high-diversity areas
are extremely variable topographically and have active erosional and aggradational
regimes. Thus, such areas may be inappropriate places to locate residential sites, or
the archaeological record in such areas may be obliterated or hidden (Reher and
Witter 1977).

In 1979, a cooperative study to further investigate the use of remote sensor
data, this time from Landsat MSS, for measuring environmental diversity for
archaeological purposes was initiated by the National Park Service’s Branch of
Remote Sensing and the U.S. Geological Survey’s EROS Program. It was proposed
that this study would incorporate analysis of five 300 by 500 pixel (approximately
27.5 by 27.5 km) Landsat 3 MSS subscenes in the San Juan Basin near the 1977 Chaco
River study area described above. The derivation of a diversity measure from these
subscenes was to be digital, and the diversity measure so derived was to be
compared with an extensive archaeological computer data base that had recently
been made available by the Park Service's Southwestern Regional Office in Santa
Fe.


Digital analysis was undertaken at the EROS Data Center, a U.S. Geological
Survey facility in Sioux Falls, South Dakota, using two digital image analysis
systems, the General Electric Image 100 system and the ESL IDIMS (Interactive
Digital Image Manipulation System). Subscenes were extracted from a Landsat 3
MSS tape (data collected August 3, 1979) and rerecorded onto digital tape. These
data were then analyzed using a maximum likelihood cluster classifier on the IDIMS
system. A SO by 80 pixel area from each subscene that was judged to be
representative of the variation within that subscene was first selected by the operators based
on the ecologic cover-type classification of the San Juan Basin discussed above
(Camilli 1984). This small area was then randomly sampled to derive a training set of
5 percent, or 20 by 4 pixels. A total of 164 such samples were derived from the four
subscenes. Using these samples as training sets, an unsupervised classification was
performed, and 13 classes resulted. These classes were interpreted and collapsed by
the operators, again on the basis of the previous cover-type interpretation as well as
internalized knowledge of the area, into seven new cover types, which were then
mapped as zones (Figure 9.13).


Once these steps had been completed, the EROS Data Center's Burroughs
7600 computer was used to pass a 3 by 3 pixel filter through the seven-zone classified
image. For each nine-pixel area, the central pixel was replaced with a value of 0-6,
indicating the number of classes other than the class represented by the central
pixel that were found within the filter. This resulted in the generation of a diversity
index (Figure 9.14), but unfortunately, edge effects relating to the direction that
the filter passed through the data set were introduced into the results. Attempts
were made to correct for this, but the configuration of the computer system at that
time was such that it could not be adapted to solve the problems. For this reason the
proposed correlations between site occurrence and the diversity measure were
never completed, although the method itself shows considerable promise.
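
A minimal sketch of the 3 by 3 filter follows, assuming the classified image is held as a list of rows of integer class codes. It is an illustration, not a reconstruction of the IDIMS processing: the function name is hypothetical, windows at the image border are simply clipped to the pixels that exist, and results are written to a separate output grid so that each window reads only original class values regardless of the order in which pixels are visited.

```python
def diversity_index(classified):
    """For each pixel, count the distinct classes in its 3 by 3 neighborhood
    other than the class of the central pixel."""
    rows, cols = len(classified), len(classified[0])
    output = [[0] * cols for _ in range(rows)]  # separate grid: input is never overwritten
    for r in range(rows):
        for c in range(cols):
            center = classified[r][c]
            others = set()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    # Clip the window at the image border (an assumed convention).
                    if 0 <= rr < rows and 0 <= cc < cols:
                        if classified[rr][cc] != center:
                            others.add(classified[rr][cc])
            output[r][c] = len(others)
    return output


# Hypothetical 3 by 3 fragment of a classified image (class codes 1-4).
fragment = [
    [1, 1, 2],
    [1, 3, 2],
    [4, 4, 2],
]
for row in diversity_index(fragment):
    print(row)
```

With seven cover types the value at any pixel ranges from 0 to 6, matching the range given in the text. The source does not say what produced the direction-dependent edge effects, so the separate output grid used here should be read as one common precaution rather than as a diagnosis of that problem.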


A number of things can be said about and learned from this last attempt at
measuring environmental diversity as an ecosystem variable with archaeological
relevance. The first is that problems of coordination and equipment compatibility
sometimes make it simpler and more cost-effective for a manager to contract with
an accountable scientist from the private sector for remote sensing research than to
rely on cooperative, interagency agreements.

Figure 9.13. A computer-generated image showing a digital classification of cover-type units along Gallegos Wash in the San Juan Basin,
northwestern New Mexico. Landsat 3 MSS data were analyzed using the IDIMS computer system at the EROS Data Center in Sioux Falls, South Dakota.
The results of an attempt to derive ecosystem variables from this classification are discussed in the text.

Figure 9.14. A digitally derived environmental diversity index resulting from further computer analysis of the cover-type classification shown in
Figure 9.13. For each pixel in the classified scene the number of cover types occurring within a three-pixel radius was counted; this score was used to
derive a diversity index. The darkest areas have the lowest diversity and the lightest areas have the highest diversity. Much past systemic behavior,
including site location choice, may be more attributable to ecosystemic variables, such as diversity, than to specific vegetation or other resource
composition, as discussed in the text.

The second observation that might be made is that the technology of digital
analysis of remote sensor data is changing so rapidly as to make analyses that were
not practical using million-dollar systems only a few years ago possible today on
small, stand-alone image processors. The RIPS (Remote Image Processing System)
that Charles Robinove (1986) used to derive his Landsat-based diversity index in
1984 is now available to the general public as a $5000 add-on to most personal
computers. This diversity measurement attempt also illustrates at least one
application of remote sensing in which digital, pixel-by-pixel classification of data is far
more useful than visual interpretation of an image into zones or areas of assumed
significance, for it would be impossible to pass a filter through an image if it were not
composed of pixels.


Finally, this example emphasizes the fact that remote-sensing-based 
approaches to the measurement of ecosystem variables for prediction and modeling 
have not been perfected, and that it may not be easy to perfect them. Remote 
sensing approaches, like predictive modeling in general, can only be refined 
through cooperative research and development on the part of managers and 
archaeologists. 


The last point is one in which remote sensing can, I feel, play an especially
important role in uniting the efforts of managers and archaeologists. Remote sensor
data form an integral and all-important part of most geographic information
systems (as discussed in Chapter 10). Such systems have been undergoing intensive 
development, particularly by natural resource managers and scientists, for at least a 
decade. I see focus on remote sensing as a primary data source for predictive 
experiments in archaeology as one way of developing a common ground, an 
independent data base, and ultimately an analytical tool that can be shared by 
archaeologists, natural resource scientists, and managers. Such a common interest 
could do much toward uniting cultural and natural resource management. 


Chapter 10


GEOGRAPHIC INFORMATION SYSTEMS: TECHNICAL AIDS 
FOR DATA COLLECTION, ANALYSIS, AND DISPLAY 


Kenneth L. Kvamme and Timothy A. Kohler 


INTRODUCTION 


Timothy A. Kohler


By this time we have seen that predictive modeling of archaeological resources
may involve consideration of the characteristics of catchments around potential site
locations, of distances to various resource types from potential locations, and of
various characteristics of the potential site location itself. Maps of several different
resource types and landscape characteristics may each need to be analyzed in terms
of catchment, distance, and point characteristics. Locations satisfying certain
criteria on all these maps may need to be identified and located. Geographic information
systems are a computerized aid for the collection, management, analysis, and
display of the large sets of spatially referenced data required for such projects. This
chapter begins with an overview of what these systems can do and then explains
their various capabilities in more detail.


Beyond its obvious role in helping to organize, overlay, and display data, a
geographic information system (GIS) also may help agencies to make cultural
resource management survey and predictive modeling efforts both more comparable
from project to project and more cumulative in their results. At present, the
physical models (maps) produced by various archaeological consultants are
drawn to different scales, using different standards. If instead models were based on
a single GIS, or on compatible systems at identical resolutions, then they could be
readily compared, and the predictions made by one group of modelers could be
tested by later surveys more accurately and conveniently. Moreover, models could
be easily refined and remapped, and the results of these refinements (and the
differences between versions) would be readily apparent. A good case could be
made that either agencies should maintain their own GIS and require all contractors
to work on it, or they should maintain long-term arrangements with contractors for
the construction of data bases containing environmental data, site location
information, and predictive models so that the cycle of model construction, testing,
revision, and verification could be carried forward cumulatively. This chapter,
however, considers only the technical role, not the implications for policy, that
geographic information systems may have in the predictive modeling process.


Maps can be defined as scales for measuring the property of location for some
attribute (Lewis 1977:3-10). Map data differ from other data in that the location of
each feature relative to all others is maintained, making properties of location (such
as distance) readily available for study. Most of the large computerized software
systems that archaeologists use regularly (such as SPSS and SAS) ordinarily maintain
information in a sequentially organized data base. Location can be entered in such a
data base by introducing variables for northing and easting, for example, but the
internal organization of the data base usually remains random with respect to these
variables, and analysis of locational properties is cumbersome.


In a GIS, on the other hand, the internal organization of the data either mimics
that of the map from which it is distilled or is based on other conventions that allow
the spatial structure of the mapped attribute to be easily reconstructed. This
facilitates various spatial studies, such as those requiring distance measures
(including catchment studies), and permits overlaying of various maps on top of each other
so that the spatial interaction of the mapped attributes can be studied.
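
The idea of a data organization that mimics the map can be illustrated with a simple gridded (raster) layout, in which a cell's row and column encode its location. The sketch below is not drawn from any particular GIS; the two layers, their values, the 50 m cell size, and the thresholds are all invented for illustration.

```python
import math

CELL_SIZE_M = 50  # hypothetical ground distance between adjacent cell centers

# Two co-registered layers covering the same 3 by 3 block of cells.
slope_pct = [
    [2, 4, 15],
    [3, 8, 20],
    [1, 12, 25],
]
water_dist_m = [
    [50, 100, 150],
    [300, 120, 90],
    [400, 350, 60],
]


def cells_meeting_all(layers_and_tests):
    """Overlay: return (row, col) of every cell that passes the test on every layer."""
    first_layer = layers_and_tests[0][0]
    return [
        (r, c)
        for r in range(len(first_layer))
        for c in range(len(first_layer[0]))
        if all(test(layer[r][c]) for layer, test in layers_and_tests)
    ]


def cell_distance_m(a, b, cell_size=CELL_SIZE_M):
    """Distance between two cells, recovered directly from their grid indices."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) * cell_size


favorable = cells_meeting_all([
    (slope_pct, lambda s: s <= 10),      # gentle slope
    (water_dist_m, lambda d: d <= 200),  # close to water
])
print(favorable)
print(cell_distance_m((0, 0), (1, 1)))
```

Because location is implicit in the indices, neither the overlay nor the distance calculation requires searching a table of northing and easting values, which is the contrast with a sequentially organized data base drawn in the preceding paragraphs.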


A working GIS consists of software (computer programs), the hardware on
which that software operates, and a spatial data base, but the term GIS is often used
to refer only to the software used for data entry, management, manipulation,
analysis, and display. Many geographic information systems have separate systems,
or subprograms, for these various major functional categories. There are probably
well more than 100 geographic information systems in use around the world, in
many times that number of installations; access to a GIS by researchers and
managers in university and agency contexts will soon be commonplace.


Comparative reviews of the most common systems are now available: Hansen
(1983) compares MOSS/MAPS with IDIMS; several systems that were originally
designed to process remote sensing data, including VICAR and IDIMS, are
compared by Bracken et al. (1983); and Enmkson et al. (1983) discuss three
microcomputer-based geographic information systems. Munro (1983) draws on the
experience of a large corporation in suggesting how a suitable GIS can be objectively
selected from those available. Systems used by the Dominion of Canada and by the
states of New York and Minnesota are described by Tomlinson et al. (1976). Finally,
the American Farmland Trust (1985) tabulated costs, operating environments, and
data entry, editing, updating, retrieval, analysis, output, and display functions for
65 geographic information systems, including 16 operating on microcomputers. Even
such a recent publication is already somewhat out-of-date, however, as both
hardware and software developments in this field are occurring very rapidly.


Training in the structure and use of geographic information systems is
available from several sources (Table 10.1). Articles relevant to geographic information
systems appear regularly in the journals and conference proceedings listed in Table
10.2, and Estes et al. (1984) and Marble et al. (1984) present useful collections of
GIS-related articles.














TABLE 10.1.

Selected training opportunities in geographic information systems


Organization                                                    System (if any)


Training and Assistance Office                                  IDIMS
U.S. Geological Survey
EROS Data Center
Sioux Falls SD 57198
(605) 594-6114

Remote Sensing Institute                                        AREAS
South Dakota State University
P.O. Box 307
Brookings SD 57007
(605) 688-4814

Yale University School of Forestry and Environmental Studies    MAP
205 Prospect St
New Haven CT 06511
(203) 436-0440

Laboratory for Application of Remote Sensing Data               LARSYS
Purdue University
1291 Cumberland Ave
West Lafayette IN 47906
(317) 494-6305

Continuing Engineering Education Program
George Washington University
Washington, D.C. 20052
(202) 676-6106

Graphics and Image Analysis Group                               VICAR/IBIS
Computing Service Center
Washington State University
Pullman WA 99164-1220
(509) 335-0411

U.S. Fish and Wildlife Service                                  MOSS/MAPS
Division of Biological Services
Western Energy and Land Use Team
Drake Creekside One, 2627 Redwing Rd
Ft. Collins CO 80526-2899

Geographic Information Systems Laboratory                       GRASS
Central Washington University
Ellensburg, WA 98926
(509) 963-1914


TABLE 10.2

Selected journals and conference proceedings containing more advanced discussions of geographic
information systems


JOURNALS

Area
Canadian Cartographer
Computer Vision, Graphics, and Image Processing
Computers and Geosciences
Computers, Environment, and Urban Systems
Environment
Environmental Management
Geo
Geographical Analysis
Geo-Processing
IEEE Transactions on Geoscience and Remote Sensing
IEEE Transactions on Pattern Analysis and Machine Intelligence
International Journal of Remote Sensing
Photogrammetric Engineering and Remote Sensing
Remote Sensing of Environment
Soil Survey and Land Evaluation


PAPERS AND PROCEEDINGS

International Symposium on Computer-Assisted Cartography
International Symposium on Cartography and Computing
International Symposium on Remote Sensing of the Environment
International Symposium on Spatial Data Handling
Annual Meeting of the American Society of Photogrammetry
Proceedings of the Pecora Symposium


ABSTRACTS

Geo Abstracts, G: Remote Sensing, Photogrammetry, and Cartography





THE POTENTIAL OF GEOGRAPHIC INFORMATION SYSTEMS
FOR RESEARCH, DEVELOPMENT, AND APPLICATION OF 
ARCHAEOLOGICAL SITE LOCATION MODELS 


Kenneth L. Kvamme


The Need for Geographic Information System Techniques 


In the previous chapters, several methods and models for classifying a location
or region as site-likely (or site-type-likely) were introduced. All of these procedures
are based, at least during some stage of the modeling process, on measured data
(where measurements can also refer to nominal-level class categories), and many
require large numbers of calculations. The various quantitative approaches require
measurements at each site location (e.g., of various environmental phenomena) and
also at locations of background environment where sites are not present, termed
nonsites, if a control-group approach is used during initial model development (see
Chapter 8). Similarly, measured data are required for all sites (and nonsites) in
model testing phases. Finally, to apply most archaeological locational models across
a region of study requires tremendous numbers of measurements. For example, if a
model based on several environmental variables is to be applied across some region,
measurements of each variable might be required every 50 m across the region for
sufficient resolution in application. The problems of making vast numbers of
measurements and performing an even larger number of statistical calculations
constitute the greatest difficulties in the development, testing, and practical
application of many archaeological modeling strategies in regional cultural resource
management contexts.


For the simplest application of environmentally based models, such variables 
as slope, aspect, and distance to water can be measured by hand at a specific locus on 
a topographic map. A site location model could then be applied to the measure- 
ments (usually requiring a few calculations) in order to assess the “site likelihood” 
or “site favorableness” of the location. This approach can be quite useful to cultural 
resource managers in assessing archaeological sensitivity at, for example, proposed 
well pad locations. 


As the size of the area to be assessed increases, however (as the number of well
pads increases and as access roads to the pads are included in the project, for
example), the labor-intensive hand measurement and calculation requirements
rapidly become impractical. Many projects on federal lands encompass large areas;
in such cases, the logical approach would be to replicate the above procedure
systematically across the area under consideration, performing the measurements
(and calculations) every 50 m east-west and north-south, for example. The outcome
would be a wide-area “site sensitivity surface” depicting favorable or likely
locations for cultural resources based on model specifications. Needless to say,
performing measurements of multiple variables at some point on a map is quite tedious;
replicating this process every 50 m or so, even over a small area, is incredibly
time-consuming and therefore costly. In addition, once these measurements have been
collected, the time and expense for all of the calculations required to apply most
models must be considered as well.
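
The systematic procedure described above, measuring and scoring every 50 m, can be sketched as a loop over grid nodes. Everything specific in this sketch is hypothetical: the logistic form and coefficients of `site_probability`, the stand-in `measure` function (which in practice would read values from digitized map layers), and the 800 by 800 m extent.

```python
import math


def site_probability(slope_pct, water_dist_m):
    """Hypothetical site-likelihood model; the coefficients are invented."""
    z = 1.0 - 0.1 * slope_pct - 0.005 * water_dist_m
    return 1.0 / (1.0 + math.exp(-z))


def measure(x_m, y_m):
    """Stand-in for map measurement at one location: returns (slope, distance to water)."""
    slope_pct = (x_m % 300) / 20.0
    water_dist_m = abs(y_m - 400)
    return slope_pct, water_dist_m


def sensitivity_surface(width_m, height_m, spacing_m, measure_fn, model_fn):
    """Apply the model at every grid node, returning a list of rows of probabilities."""
    surface = []
    for y in range(0, height_m, spacing_m):
        row = [model_fn(*measure_fn(x, y)) for x in range(0, width_m, spacing_m)]
        surface.append(row)
    return surface


surface = sensitivity_surface(800, 800, 50, measure, site_probability)
print(len(surface) * len(surface[0]))  # number of locations evaluated
```

The point of the sketch is the scale of the task rather than the model itself: even this small extent requires 256 sets of measurements, and the count grows with the square of the extent's linear dimension.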


As an illustration of the magnitude of this problem, the effort that was required
to produce a “probability surface map” of site presence for a single quarter section
utilizing manual techniques can be examined (Kvamme 1983a; this map is illustrated
in Figure 8.1). To produce this map, six environmental predictors (slope, aspect,
view angle, shelter rank, vantage distance, and distance to water) were measured
by hand at 256 points evenly spaced at 50 m intervals across the quarter section for a
total of 1536 measurements. Next, the probability of each location’s membership in
a site-present class, conditional on the measured data, was estimated by a
preestablished function. The mathematical operations needed to assess one
location required roughly eight additions, three subtractions, nine multiplications,
one division, and three exponentiations; for all 256 locations approximately 6144
calculations were performed! Finally, it was necessary to produce a graphic of the
result for each location, which constituted a mapping of the model; this required
further effort. It is clear that application of this kind of model utilizing manual
techniques is impractical for any but the smallest of regions.
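

The totals quoted above can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the counts come from the text, while the variable names and the assumption of a square 16-by-16 layout of points are illustrative.

```python
# Counts are taken from the text; the names and the square layout are assumptions.
points_per_side = 16      # 16 x 16 = 256 points (assumed square arrangement)
predictors = 6            # slope, aspect, view angle, shelter rank,
                          # vantage distance, and distance to water

# Arithmetic operations needed to evaluate the model at one location.
ops_per_location = {
    "additions": 8,
    "subtractions": 3,
    "multiplications": 9,
    "divisions": 1,
    "exponentiations": 3,
}

n_points = points_per_side ** 2
n_measurements = n_points * predictors
n_operations = n_points * sum(ops_per_location.values())

print(n_points, n_measurements, n_operations)  # 256 1536 6144
```

Each location thus requires 24 operations, and the same bookkeeping shows how quickly the effort grows when the grid is extended beyond a single quarter section.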


Manual techniques pose a number of problems in the model development and 
testing stages as well. Perhaps most apparent is the effective limitation of sample 
sizes owing to the excessive labor requirements of measurement. For example, a 
region might contain several hundred known sites, but it might not be possible to 
use all of them for model development or testing because of the difficulties of
measurement. This is even more likely to be the case for nonsites if the
control-group approach is used, since potential sample sizes of many thousands of
nonsites can in principle be obtained from the background environment.


Perhaps a more serious effect of hand-measurement of variables is that a large 
amount of variation can be introduced into an analysis simply through measure- 
ment error. Significant differences can be observed between measurements taken 
by different people or in measurements made by the same person at different times, 
even for variables as easy to measure as distance to nearest stream or slope as 
percent grade. This factor can introduce major variation into the outcome of a 
model and can also affect the application of a model. 


A major disadvantage of manual measurement has become apparent only with 
the implementation of computer-based GIS technology in archaeological locational 
studies. Human measurement, primarily because it is slow and time-consuming, 
actually limits the kinds of phenomena that might potentially be examined, or even 
conceived, in site location research. For example, for a given locus on a map (such as
a site location), or even for several loci, it might be possible to estimate a least-effort
travel distance (as opposed to a linear distance) to a nearest water source (discussed
below), or it might be possible to calculate, as a relative measure of view quality, the
percentage of terrain that is visible within a given area. It is not possible to do these
kinds of calculations manually for many hundreds or thousands of locations (or, for
example, every 50 m across a map area). In fact, since we inherently think in a
“manual mode,” such variables are rarely even considered. This poses a serious
constraint on archaeological locational research.





Archaeologists are great gatherers of information. We collect data pertaining
to where sites are found or even where individual artifacts are located. We gather
information describing regions surveyed, the intensity of the survey, when the
region was surveyed, and who surveyed it. We collect data about site content, the
locations of features and artifacts within a site, cultural affiliation, various site
components, and the amount and kinds of work performed. Various ecological data,
such as environmental associations, might be recorded, as well as modern features,
such as existing roads, trails, dwellings, and towns. It is important to recognize that
much, perhaps most, of our data are geographically distributed; that is, they have a
mappable component. A major problem is that it is often difficult to manage large
bodies of regional information and to retrieve particular information because part of
the data might exist on maps while other information might be located in site forms,
in project reports, in published articles, or even in museum collections. The
usefulness of our great collecting efforts is thus severely compromised.


Finally, archaeologists have been working with unmanageably large, geograph- 
ically distributed computer data bases, such as digital representations of remotely 
sensed images or digital terrain models, for a number of years (e.g., Green and 
Stewart 1983; Lyons and Hitchcock 1977). These unwieldy sources of information 
are often difficult to analyze, explore, and manipulate, and it is not easy to arrive at 
conclusions about them (McLeod and Jafek 1984). Various sources of data might
occur at different scales, in several map projections, or might even be geometrically 
distorted owing to the tilted angle of a remote sensor platform, making it difficult 
not only to register one source of data to another (such that a particular point in space 
lines up with the same point in all the other data sources over the entire region of 
study) but to locate even a single point in space in all data sources. These problems 
are major limiting factors in the practical use of these data bases in regional
archaeological investigations. 

GIS technology can virtually eliminate these problem areas and limitations. 
First, computers can perform many thousands of measurements of potentially all 
variables examined in site location studies in a matter of seconds and permanently 
store those measurements for later use. This virtually eliminates sample-size 
problems for known site locations and also permits us to obtain extremely large 
samples of the background environment (or nonsites) for comparative studies as 
well. Second, such complex calculations as probability estimates can be performed 
quickly and in large numbers. Third, cartographic capabilities inherent in a GIS can 
provide maps of virtually any result quickly and at low cost. Fourth, variation in 
measurement is entirely eliminated: the computer produces the same result every 
time. Fifth, depending on the ingenuity of the user, the available software, and the 
software developer, the potential for creating and exploring new types of informa- 
tion of relevance to archaeological research and problem solving in site location 
studies is limitless. Last, geographic information systems provide a comprehensive 
system for the management of large, diverse, and unwieldy geographic data sets 
obtained from virtually any source, such as site files, aerial photographs, remotely 
sensed imagery, or conventional maps. All types of information, despite their 
original disparity, are referenced to a common geographic coordinate base (such as 
longitude and latitude or the Universal Transverse Mercator grid), providing a 
logical means for data storage, manipulation, retrieval, and interpretation. Thus, 
only through GIS capabilities does it become possible to utilize many of the data and
approaches toward understanding and modeling prehistoric site distributions that 
have been outlined in this volume. 





The following sections describe in greater detail the mechanics behind
geographic information systems and their capabilities for archaeological locational
research. The material in these sections is not necessarily limited to a discussion of
what existing geographic information systems are able to do. Rather, the goal is to
present what a GIS can potentially offer to archaeology without the restriction of
working with existing systems, since few have been designed with the archaeologist
in mind. Thus, the use of a GIS to provide measurements of such concepts as terrain
variability, view quality, vegetation diversity, or point-to-point visibility, for
example, will be discussed. The ability to compute such data, of course, may not be
available in most commercially produced geographic information systems, yet it is
these kinds of data that are vital if geographic information systems are to be useful
tools, rather than restrictive tools, for archaeological research. Archaeologists
should certainly have the ability, monetary or otherwise, to influence software
developers to provide necessary computer programs, and many archaeologists are
rapidly gaining expertise as computer programmers themselves. Moreover, many
governmental agencies employ programmers to meet the various information needs
of their personnel. Hence, there is little reason why archaeologists should not have
access to a GIS with capabilities tailor-made to meet their analysis needs.


The Fundamentals of Geographic Information Systems 


Geographic information systems are computer-based means for assembling, 
analyzing, and storing varied forms of data corresponding to specific geographical 
areas, with the spatial locations of these areas forming the basis of the system 
(Tomlinson et al. 1976). The term GIS, as used here, is restricted to computer 
systems that are able to interrelate sets of data representing different geographical 
variables, as opposed to systems that merely manipulate or map individual files of 
geographical data (Rhind 1981). As Bryant and Zobrist put it, geographic information
systems “seek to capitalize on the synergism inherent in being able to
automatically compare a variety of socioeconomic, environmental, and land use data 
sets for the same point on the ground” (1977:120). 


Virtually any type of geographically distributed information from any source
can potentially be encoded in computer-compatible form. By using a GIS it is
possible to extract information from digital geographic data bases, manipulate the
data, derive new data, and analyze this information to propose solutions to problems.
Thus, geographic information systems are able to transcend the role of merely
processing and displaying information; they also can be incorporated into the
analysis, interpretation, and problem-solving aspects of research in geographically
distributed phenomena and processes (Hasenstab 1983a).


Many types of geographically distributed data can serve as the primary 
information portion of a GIS: elevation data, river and stream locations, vegetation
patterns and soil types (which might be derived from satellite remote sensing), 
known archaeological site locations, and regions of planned construction or devel- 
opment are examples. At its simplest, a GIS can be used to retrieve spatially 
distributed information that is encoded in data bases for a specified coordinate 
point, such as the locus of a small archaeological site.


Such a procedure, however, does not fully utilize a central capability of
geographic information systems: the ability to derive new information beyond
that originally encoded in the data base (Collins and Moon 1981). For example, from
interrelationships between known points of elevation in the data base it is possible
to estimate, at any locus, values of slope, aspect, and a variety of local relief and
terrain variability measures, or major drainage basins can be defined using the same
data (Monmonier 1982:76-79). Points of vantage, such as hilltops and ridges, can also
be determined (Kvamme 1983a). From a digital hydrology net, distances to nearest
seasonal or permanent streams can be computed, and from digitized vegetation
data, distances to a specified plant community (Lee et al. 1984), complex indices of
vegetation diversity, or even local caloric potential can be measured. Listings of
nearest neighbor sites and distances can be obtained, as well as the distance to a
central place village from a data “layer” containing known archaeological site
locations.


An important benefit of the data-generating capabilities of geographic
information systems is that information that was previously impossible to obtain owing
to the sheer number of required calculations can be derived. Maximum view
distances, measures suggesting shelter or view quality, and least-effort travel
distances are all potential information classes that illustrate this property. The next
section discusses in greater detail the nature of these various analytical surfaces.


GIS Analytical Surfaces


A central GIS concept is that of analytical surface, which refers to the individual
“layers” or data planes of information in a geographic data base (National Research
Council 1983:41-43). Primary sources of information necessary for the construction
of a GIS must be encoded in computer-compatible form. For regional archaeological
research, primary information might include environmental data, such as elevation
contours, river and stream locations, and vegetation and soils types, as well as
cultural data, such as known archaeological site locations, archaeologically
field-inspected regions, access roads, and areas of planned development or impact.


It is possible to obtain through the U.S. Geological Survey or private companies
many types of geographical data, particularly regional elevation data, already in
digital form and on computer tape. For example, digital terrain tapes, which were
originally produced by the Army Map Service (now the Defense Mapping Agency),
are available at low cost from the U.S. Geological Survey (National Cartographic
Information Center 1980). The digital terrain tapes were produced by digitizing the
elevation contours on 1:250,000 scale topographic series maps, and they are available
for the entire United States (Doyle 1978:1484). As might be expected, these data are
somewhat crude owing to the scale of the original map sources, and recent studies
(Stow and Estes 1981) point to inaccuracies in the resulting elevation surfaces (e.g.,
small ridges, drainages, and canyons are underrepresented).


As an alternative, the USGS is currently producing highly accurate digital
elevation models (DEMs) that are obtained through digitization of 1:24,000 scale
topographic maps (Doyle 1978:1484). Not only the elevation data but also other
classes of planimetric information, including hydrologic and cultural data, such as
road networks, are available for these maps. A limitation of this data source is that
only a small percentage of the quadrangles across the country have been digitized to
date, although the USGS ultimately plans to digitize all the 1:24,000 scale maps. For
a particular study region, high-quality elevation and hydrologic data, two of the
most important sources of information for archaeological locational studies, may
already be available in digital form. It is unlikely, however, that other sources of
information, such as vegetation and soil data, will be available in digital form, and
archaeological data certainly will not be available. As a result, it is often necessary to
digitize these data electronically.


A common digitizing procedure utilizes a digitizing tablet and cursor (Mon- 
monier 1982:7; Rogers and Dawson 1979). With these devices, such pictorial infor- 
mation as elevation contour lines or stream courses are manually traced and 
encoded in computer-compatible form (Figure 10.1). The tablet may contain as
many as a million x,y coordinates per square inch (Calcomp 1983); as the lines are
being traced they are electronically converted to corresponding x,y coordinates that
the computer is able to utilize. This procedure is, of course, somewhat labor
intensive. For example, digitizing the elevation contours on a typical USGS
7.5-minute quadrangle can take anywhere from one to six (or more) person days,
depending on the complexity of the terrain. State-of-the-art digitizing technology
utilizes optical scanners to digitize complex pictorial information in seconds (Leberl
and Olson 1982), but this equipment can be very expensive.


Figure 10.1. Manual digitizing of contour lines through use of a cursor and digitizing tablet. Pictorial map
information, affixed to the tablet, is converted to x,y coordinates by manually positioning the cross hairs of the cursor
over the intended point and pressing a button. The keys on the cursor control different functions or allow entry of
category codes.


The primary data are usually derived from traditional maps, but other sources,
such as preclassified or interpreted remotely sensed digital satellite images, can be
used (Shelton and Estes 1981; see below). However they are acquired, the several
primary surfaces of digital information that the GIS needs are encoded and stored in
the initial data base (Figure 10.2). Computer programs then are able to utilize these
primary data to derive secondary information that often is more useful than the
primary data (Collins and Moon 1981). For example, slope estimates, aspect
estimates, or distances to nearest drainages might be derived (from elevation and
hydrology surfaces, respectively) and stored as new and distinct analytical surfaces
(Figure 10.2).


Figure 10.2. Construction of a GIS. [Panel titles: Original ground surface and thematic maps; Primary surfaces;
Secondary surfaces.] From the original land surface (b), various thematic maps are produced, such as
elevation contours (c), hydrology (d), and forested areas (e). These maps are digitized and converted to primary layers in a GIS
representing an elevation surface (f), a hydrology surface (g), and a forest location surface (h), which are all referenced to a
reference grid, such as the UTM grid (a). From the elevation surface such secondary surfaces as slope (i), aspect (j), and local
relief (k) might be obtained. The hydrology surface might provide a secondary surface showing distance to nearest drainage
(l), and the forest location surface might yield a surface showing distance to nearest forest (m).


A basic principle of geographic information systems is that the users provide
the system (through digitization or other means) with the minimal information that
it needs (primary layers). The GIS itself subsequently derives secondary sources of
data by means of various software techniques. Both primary and secondary surfaces
can then be used for analytical or display purposes. The specific ways in which these
data are utilized, however, depend on the nature of the particular GIS.


GIS Types 


There are two fundamental GIS designs. A vector-based GIS, such as the
Department of the Interior’s MOSS (Lee et al. 1984), stores data as a series of points,
lines, or polygons that are used to identify discrete features that typically occur on
traditional maps (vector is another word for a line between two points). A cell-based
(sometimes called raster-based) GIS, such as the Department of the Interior’s
MAPS (a subsystem of MOSS), superimposes a regular grid containing rows and
columns of cells over the region and assigns a numeric value to each cell (Figure
10.3). Each design has certain advantages and disadvantages in terms of
archaeological locational analysis and modeling.


Vector-Based Geographic Information Systems


Vector-based geographic information systems accommodate information
digitized as points, lines, or polygons (i.e., mappable data; Figure 10.3). Computer
storage requirements for this information are minimal since only the coordinates of
digitized points (points along line or polygon boundaries) are stored. A vector-based
system is suitable for cultural resource information management since various
mappable entities (archaeological sites, site boundaries, surveyed regions, and
archaeologically sensitive zones) are easily retrieved and displayed, as are other
types of discrete map information (e.g., specific soil type locations). A vector-based
GIS can also be used for the display of very simple site location models that are based
on a one-to-one correspondence between the locations of sites and discrete categories
of information, such as plant community or soil type locations (Thompson 1978;
see also Cordell and Green 1983). For example, if a site location model suggests high
site density in piñon-juniper settings, a vector-based system can easily present a
series of polygons showing the locations of high-site-density piñon-juniper zones.


Although vector-based geographic information systems can be used to manage
and display discrete classes of map data, these systems are unsuitable for many of
the analysis and modeling techniques described in earlier chapters. In analysis and
modeling contexts, systematic measurements or observations of environmental or
other features are required (e.g., every 50 or 100 m across a region of study). In other
words, spatially contiguous values of the data are necessary. In vector systems such
information is not available; data values are present only at point, line, or polygon
boundaries, which constitute only a very small portion of any region. This shortcoming
is further illustrated by the fact that even if continuously varying map
information is available in digital form, such as elevation or slope values, such data
must be transformed to line data by contouring or by categorizing the continuous
measurements into discrete classes (e.g., level vs. steep slopes) to be handled by a
vector GIS.


Figure 10.3. [...] systems (after Lee et al. [...]).


Cell-Based Geographic Information Systems


With a cell-based or raster GIS, both categorical and continuous map information
can be incorporated. Since a grid is superimposed over the entire region, each
analytical surface is composed of rows and columns of grid cells, each cell
corresponding to a fixed area in real space and each containing a value for that area
(Figure 10.3). For example, an elevation surface would contain an elevation in each
cell representing the height of the ground; a slope surface would contain a slope
measurement in each cell; and a nominal-level surface, such as a representation of
plant community locations, would contain a unique value in each cell, with each
value corresponding to a specific plant community class. Since a value must be
stored for each cell for each analytical surface, cell-based geographic information
systems typically require large amounts of computer storage. Owing to the gridding
or rasterization of features, the quality of display of information can suffer to some
extent (Figure 10.3), although this depends on the resolution (size) of the cells and
of the display device (see below). These difficulties are decreasing, however,
because mass storage and high-resolution display devices are rapidly becoming
available at low cost.
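

The cell-based organization just described can be sketched in a few lines of code. This is an illustration only, not the storage scheme of MAPS or any other system: the grid origin, the 50 m cell size, the layer names, and every value are invented for the example.

```python
# Minimal sketch of cell-based layers; every value below is invented.
CELL_SIZE = 50.0          # metres on a side (assumed)
WEST_EDGE = 500000.0      # easting of the grid's west edge (hypothetical)
NORTH_EDGE = 4200000.0    # northing of the grid's north edge (hypothetical)

# Each analytical surface is a grid of rows and columns, one value per cell.
layers = {
    "elevation": [            # continuous data: metres above sea level
        [1500, 1502, 1505, 1509],
        [1498, 1500, 1503, 1506],
        [1495, 1497, 1500, 1502],
    ],
    "vegetation": [           # nominal data: one code per plant community
        [1, 1, 2, 2],
        [1, 2, 2, 3],
        [2, 2, 3, 3],
    ],
}


def cell_index(easting, northing):
    """Convert a ground coordinate to the (row, col) of the cell containing it."""
    col = int((easting - WEST_EDGE) // CELL_SIZE)
    row = int((NORTH_EDGE - northing) // CELL_SIZE)   # row 0 is the north edge
    return row, col


def values_at(easting, northing):
    """Return the value of every layer for the same point on the ground."""
    row, col = cell_index(easting, northing)
    return {name: grid[row][col] for name, grid in layers.items()}


def cells_stored():
    """Total number of values held: one per cell per layer."""
    return sum(len(grid) * len(grid[0]) for grid in layers.values())


print(values_at(500120.0, 4199930.0))
print(cells_stored())
```

Because every layer shares the same grid, a single row and column index retrieves the elevation and the plant community for the same locus, and the storage cost is simply the number of cells multiplied by the number of layers.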


Since cell-based geographic information systems can accommodate continuously
varying and categorical information, treating each as a surface of contiguous
values, and since they can easily derive and store many types of new data of
relevance to archaeological inquiry over entire study regions, this type of GIS is well
suited for archaeological locational analysis and modeling research. Additionally,
with a cell-based GIS each analytical surface, regardless of type, can be treated as an
“image” (referred to as a pseudo-image in image analysis). This means that the
researcher can make use of the large number of available image analysis, manipulation,
and classification techniques (see Chapter 9 for an overview of some of these),
as well as a host of image-processing software packages (see Kohler’s brief overview
later in this chapter). The following sections focus on cell-based geographic
information systems since they are better suited for the archaeological analysis and
modeling approaches discussed in this volume.


GIS Issues 


Several issues in GIS research are of importance to archaeological modeling
applications. One issue is that of cell size in a cell-based GIS (Wehde 1982). The size
or resolution of the cells is extremely important because it determines the nature
and quality (accuracy) of the features that can be analyzed. For nominal-level
features, such as vegetation community locations, a large grid may severely
misrepresent the true shapes and sizes of the categories, which may result in inaccurate
border and area estimates (Figure 10.4a). For continuous data, such as an elevation
surface, large cells tend to smooth features of the terrain; small ridges, canyons, or
drainages might be underrepresented, less pronounced, or even invisible on the
gridded surface (Figure 10.4b). An additional result is that any surface derived from
such an elevation layer (e.g., slope, aspect, relief, and ridge identification; see
below) will also be smoothed.


Although small cell sizes may portray various features more accurately, an
important consideration is that computer storage requirements increase geometrically
with decreased cell size. For example, to store the information from a typical
7.5-minute USGS map gridded in cells 100 m on a side (about one-sixth of an inch on
the map) would require about 15,000 cells per layer of data; cells 50 m on a side
(about one-twelfth of an inch on the map) would require about 60,000 cells per layer.
Thus, some balance must be struck between cell resolution and storage requirements.
It should be emphasized, however, that small cell size does not necessarily
guarantee accuracy. It is technically possible, for instance, to increase the resolution
in any data plane (say from 200 m to 30 m on a side), but if the data were initially
[...]


Figure 10.4. [Caption largely illegible in the source; it concerns the effect of cell size on discrete classes and on
continuous data such as an elevation surface, where a coarse grid results in a smoothed surface.]


Another important issue is the registration of [...]
each cell in the other layers. This is a particular problem when combining data from
such diverse sources as aerial photographs, remotely sensed images, and a variety of
map projections and scales. A wide variety of procedures for registration of multiple
data sources can be found in a number of standard image-processing sources (in
particular, Moik 1980:187-198; Schowengerdt 1983:99-116).


Figure 10.5. Steps that might be followed in the construction of an elevation surface. (A) Digitized
points are indicated on a contour line. (B) The digitized points are placed in appropriate grid cells. (C) The
cells between the digitized cells are filled in. (D) In a gridded or rasterized version of the original contour
map, contour line cells contain the elevation value of the contour; empty cells contain a zero. (E) The initial
surface of interpolated elevations is “noisy.” (F) The final elevation surface is smoothed.


[...] elevation might be estimated from known values that are quite different from the
values used to estimate adjacent elevations, resulting in some disparity between 
adjacent elevation estimates. The final step in creating an elevation surface, called 
smoothing, attempts to remove this noise by providing a better elevation estimate at 
each location (Allan 1978:1518; Monmonier 1982:65-66). This smoothing process
(which is distinct from the detrimental smoothing caused by large cell sizes)
recognizes that elevation estimates in adjacent cells, because of their proximity or
high autocorrelation, should also be good estimates of each cell’s elevation. A final
estimate in each cell is therefore typically accomplished by taking a weighted 
average of each cell’s elevation and the elevations in the adjacent cells (with most 
weight being given to the current cell). The more familiar smoothing in one 
dimension is illustrated in Figure 10.7a, while two-dimensional smoothing is shown 
in Figure 10.7b. The resulting surface, without the artificial peaks and valleys, is 
illustrated in Figure 10.5f. 
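

The weighted-average smoothing described above might be sketched as follows. The text does not give the weights, so the values used here (4 for the current cell, 1 for each adjacent cell) are assumptions, as is the treatment of edge cells, which simply average over whatever neighbors exist.

```python
def smooth(surface, center_weight=4.0, neighbor_weight=1.0):
    """Return a smoothed copy of a gridded surface (list of rows).

    Each cell becomes a weighted average of itself and its adjacent cells,
    with most weight given to the cell itself. The weights are illustrative.
    """
    rows, cols = len(surface), len(surface[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            weight_sum = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    # Skip neighbors that fall outside the grid.
                    if 0 <= rr < rows and 0 <= cc < cols:
                        is_center = dr == 0 and dc == 0
                        w = center_weight if is_center else neighbor_weight
                        total += w * surface[rr][cc]
                        weight_sum += w
            out[r][c] = total / weight_sum
    return out


# A flat surface with one artificial peak left over from interpolation.
noisy = [
    [100, 100, 100],
    [100, 112, 100],
    [100, 100, 100],
]
smoothed = smooth(noisy)
print(smoothed[1][1])
```

With these weights the artificial 112 m peak is pulled down toward its neighbors while a genuinely flat surface is left unchanged; heavier center weights would preserve more of the original detail.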


Other primary surfaces are somewhat easier to obtain (if not already available
commercially). For a hydrology net, the stream locations are digitized in much the 
same way as elevation contours (Figure 10.8a). The digitized streams are then 
placed in grid cells to form a rasterized image of the hydrology net (much like the 
rasterized elevation contours in Figure 10.5d). The streams, however, might be 
coded to reflect permanent or seasonal water (Figure 10.8b) or Strahler order ranks
(Figure 10.8c; see Chapter 8 for a description of the Strahler order ranking system). 


Rasterization of polygonal areas, lines, and points, which are used to describe 
discrete classes of information, such as vegetation communities, soil types, archaeo- 
logical site locations, and archaeologically field-inspected regions, is fairly straight- 
forward. Digitized polygons are merely transformed to a gridded version of the 
polygons (Figures 10.3 and 10.4a) using various polygon-fill routines (MacDougall 
1971:117-126; Monmonier 1982:68-73). Polygon cells that represent a particular class
are assigned a unique identification number. 


Secondary Surfaces 


An infinite number of secondary analytical surfaces of potential importance to
regional archaeological research can be derived from the primary surfaces in a GIS
framework. Two common types are slope and aspect. Based on interrelationships
between the elevation of a grid cell and those of its nearest neighbors in the
elevation surface, some algorithms (e.g., Woodcock et al. 1980) fit a least-squares
plane to these elevations and find the maximum slope and the direction of maximum
slope (aspect) on this plane (Figure 10.9a). Other algorithms might find a maximum,
minimum, or average slope (e.g., MOSS; Lee et al. 1984).
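

A least-squares plane fitted to a cell and its eight neighbors can be sketched as below. This is a generic plane fit, not the specific routine of Woodcock et al. or of MOSS. For a three-by-three window the fit reduces to differences between opposite edges of the window. The function name, the convention that row 0 is the northern row, and the choice to report aspect as the downslope compass bearing are all assumptions made for the example.

```python
import math


def slope_aspect(window, cell_size):
    """Fit z = a + b*x + c*y to a 3 x 3 window of elevations.

    window[0] is the northern row and window[r][0] the western column.
    Returns (slope in percent, aspect in degrees clockwise from north),
    where aspect is the direction the surface faces (downslope).
    """
    east = sum(window[r][2] for r in range(3))
    west = sum(window[r][0] for r in range(3))
    north = sum(window[0][c] for c in range(3))
    south = sum(window[2][c] for c in range(3))

    # Least-squares gradients: six cells lie one cell width off each axis.
    dz_dx = (east - west) / (6.0 * cell_size)     # rise per metre eastward
    dz_dy = (north - south) / (6.0 * cell_size)   # rise per metre northward

    slope_pct = 100.0 * math.hypot(dz_dx, dz_dy)
    if slope_pct == 0.0:
        return 0.0, None                          # level cell: no aspect
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope_pct, aspect


# Ground rising 5 m per 50 m cell toward the east, so it faces west.
window = [
    [100, 105, 110],
    [100, 105, 110],
    [100, 105, 110],
]
print(slope_aspect(window, 50.0))
```

Applying the function to every interior cell of an elevation layer would yield two new analytical surfaces, one of slope and one of aspect.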


A variety of terrain variability measures are easy to obtain from the elevation 
surface (see Chapter 8 for more detailed discussion of these variables). For example, 
local relief (maximum minus minimum elevation) can be obtained within any 
defined radius of a given cell (Figure 10.9b). Another terrain roughness measure is
termed a texture measure in image processing (Moik 1980:232). This measure finds
the variance of elevations within a defined radius or “window” of a given location
(Figure 10.9b): high values suggest variable or rough terrain while low values
suggest level or smooth terrain. Fragmentation indices (Monmonier 1974) provide
other analytical alternatives.


Figure 10.7. [...] noisy trend (left) is compared to the same trend after smoothing (right).


Figure 10.8. Encoding of hydrology data. (A) Digitized points are indicated on a hydrology net. (B) The
stream locations are placed in grid cells. Seasonal water might be coded as “1” and permanent water as “2.” (C)
Streams might also be coded according to Strahler order ranks.
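

Local relief and the variance-based texture measure can both be computed from the same window of elevations, as in the sketch below. A square window is used in place of a true circular radius, and the population variance is used for texture; both are simplifying assumptions, and the function names are illustrative.

```python
def window_values(surface, row, col, radius=1):
    """Collect the elevations in a square window centred on (row, col)."""
    rows, cols = len(surface), len(surface[0])
    return [
        surface[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]


def local_relief(surface, row, col, radius=1):
    """Maximum minus minimum elevation within the window."""
    vals = window_values(surface, row, col, radius)
    return max(vals) - min(vals)


def texture(surface, row, col, radius=1):
    """Variance of the elevations within the window."""
    vals = window_values(surface, row, col, radius)
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


elevation = [
    [100, 100, 100],
    [100, 109, 100],
    [100, 100, 100],
]
print(local_relief(elevation, 1, 1), texture(elevation, 1, 1))
```

Both measures are zero on perfectly level ground and grow as the terrain within the window becomes more varied.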


Hilltop, mesa edge, and ridge crest vantage locations might be defined using a
variety of techniques (e.g., Kvamme 1983b). For example, the previously derived
slope data plane might be used to define all level locations (e.g., those with grades
less than or equal to 15 percent) adjacent to or within a certain distance of steep
locations (those with grades greater than 15 percent). The elevation surface is then
used to delimit those locations (cells) above the adjacent steep locations.
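

One reading of this procedure is sketched below: a cell is flagged as a vantage location if it is level, has at least one steep cell among its eight neighbors, and stands higher than every one of those steep neighbors. The adjacency rule and the "higher than all steep neighbors" test are interpretations, not the method of Kvamme (1983b), and the function name is hypothetical.

```python
def vantage_cells(slope_pct, elevation, threshold=15.0):
    """Flag level cells that overlook adjacent steep cells.

    slope_pct and elevation are grids of the same shape (lists of rows).
    Returns a grid of booleans.
    """
    rows, cols = len(slope_pct), len(slope_pct[0])
    result = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if slope_pct[r][c] > threshold:
                continue                      # steep cells are never vantages
            steep_neighbor_elevs = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                        if slope_pct[rr][cc] > threshold:
                            steep_neighbor_elevs.append(elevation[rr][cc])
            # Level, next to steep ground, and above all of that steep ground.
            if steep_neighbor_elevs and all(
                elevation[r][c] > e for e in steep_neighbor_elevs
            ):
                result[r][c] = True
    return result


# A one-row profile: mesa top, steep face, valley floor.
elev = [[120, 110, 100]]
slope = [[5, 30, 5]]
print(vantage_cells(slope, elev))
```

In this profile only the mesa-top cell qualifies; the valley floor is also level and adjacent to the steep face, but it lies below it.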


An angle of surrounding view, one possible measure reflecting quality of view, 
can be obtained from the elevation surface simply by calculating for each cell the 
angle that encompasses all elevations in the surrounding eight cells that are less 
than the current cell's elevation (Figure 10.9c). A “view catchment,” another 
possible measure of view quality, might be calculated by fixing a 1 mi radius around
each cell and calculating the percentage of cells within that radius that are visible 
from the current cell (Figure 10.9d; Lee et al. 1984). 


More traditional catchments might be calculated using a nominal-level
vegetation layer. Given a fixed catchment radius around each cell (Figure
10.9d), the proportion of various plant communities within that radius can be
obtained and stored in separate derived layers. Alternatively, some index of
vegetation diversity or complexity or some estimate of caloric potential might
be calculated.
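
As a rough sketch of the catchment calculation, the function below tallies the
share of each vegetation class among the cells whose centers fall within a
circular radius (measured in cells) of a given cell. The class codes, the grid,
and the function name are hypothetical.

```python
from collections import Counter


def catchment_proportions(veg, row, col, radius):
    """Proportion of each vegetation class within `radius` cells of (row, col)."""
    rows, cols = len(veg), len(veg[0])
    counts = Counter()
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            # Keep only cells whose centres lie inside the circular catchment.
            if (r - row) ** 2 + (c - col) ** 2 <= radius ** 2:
                counts[veg[r][c]] += 1
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


# Hypothetical nominal-level layer: P = pinyon-juniper, G = grassland, S = shrub.
vegetation = [
    ["P", "P", "G", "G"],
    ["P", "P", "G", "G"],
    ["S", "S", "G", "G"],
    ["S", "S", "S", "G"],
]

props = catchment_proportions(vegetation, 1, 1, 1)
print(props)
```

Running the function once per cell and writing each class's proportion to its
own grid would yield the separate derived layers mentioned above.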


Figure 10.9. Examples of various computational algorithms. (A) A least-squares
plane might be fitted to an elevation (the central sphere and shaded cell) and
its eight nearest neighbor elevations. The maximum slope on this plane might be
calculated, along with the direction of maximum slope (aspect). (B) Local
relief might be calculated as the range in elevations in a three-by-three
window around a current elevation. Alternatively, the variance of the
elevations might be calculated to derive a texture measure. (C) An angle of
view could be calculated in an elevation surface. (D) A catchment radius can be
fitted around a cell. Areas or percentages of the feature of interest can be
calculated.


Search and distance-measuring routines can be used to derive a variety of
analytical surfaces; the MOSS-MAPS system, for example, has several (Lee et al.
1984). The nearest specified water type (e.g., seasonal, permanent, or a stream
of specified Strahler order rank) might be located from a primary hydrologic
net, for instance, and the horizontal Euclidean distance could be calculated
for each cell (Figure 10.10a). In conjunction with the elevation surface, the
vertical distance to the same drainage type might also be obtained. If hilltop,
mesa edge, or ridge crest vantage points are already defined, search procedures
can be used to obtain a distance to nearest vantage within each cell or, using
a vegetation community surface, the distance to a specified plant community.
Linear distances, however, might not be the best measure to use in site
location studies (Ericson and Goldstein 1981); because there often are
obstacles to cross, people do not normally follow straight paths. If
appropriate software is available, and definitions of “effort” can be made (see
Turner 1978), least-effort travel distances might be estimated instead
(Figure 10.10b).
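
The horizontal distance surface can be illustrated with a deliberately simple
sketch. Note that this is a brute-force comparison of every cell against every
feature cell, not the expanding-radius search of Figure 10.10a, and that it
measures straight-line distance only; the function name, the 50 m cell size,
and the stream layout are assumptions made for the example.

```python
import math


def distance_to_nearest(feature, cell_size=1.0):
    """Grid of Euclidean distances from each cell centre to the nearest flagged cell."""
    targets = [
        (r, c)
        for r, row in enumerate(feature)
        for c, flag in enumerate(row)
        if flag
    ]
    if not targets:
        raise ValueError("the feature layer contains no flagged cells")
    return [
        [
            cell_size * min(math.hypot(r - tr, c - tc) for tr, tc in targets)
            for c in range(len(feature[0]))
        ]
        for r in range(len(feature))
    ]


# Hypothetical stream layer: 1 marks a cell crossed by a drainage.
stream = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

dist = distance_to_nearest(stream, cell_size=50.0)
for row in dist:
    print([round(d, 1) for d in row])
```

A vertical distance surface could be derived in the same pass by recording
which feature cell was nearest and differencing the two elevations.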


Geographic information systems can accomplish many of the same tasks using
“cultural” variables as they do for environmental ones. For example, if central
place sites are defined in the data base, then distance from each cell to the
nearest central place can easily be generated. Similarly, based on the
locations of known archaeological sites, various orders of nearest neighbor
site distances can be calculated.





These examples illustrate the kinds of phenomena one might potentially
investigate in site location studies through the use of GIS techniques. Such
investigations are limited only by our ability to innovate and be creative (and
by CPU and storage requirements)!

Figure 10.10. Illustration of distance calculation techniques. (A) To obtain
linear distance the computer scans from a current cell with search radii of
increasing length until the feature of interest is encountered. (B) Measurement
of least-effort travel distance might consider paths that avoid hills.


Geographic Information Systems and Remote Sensing


As discussed in Chapter 9, the potential of remotely sensed data for a number
of applications in archaeology and elsewhere is beyond question. Recently, a
number of remote sensing specialists have noted that geographic information
systems “have significant potential to facilitate use of remotely sensed data”
(Shelton and Estes 1981:395). A key problem in remote sensing, for example, is
that remote sensor imagery is usually geometrically distorted; for these data
to be useful in applied contexts the interpreted information must be
transferred to a standard geometrical base or georeference (Steiner and Salerno
1975:622). Tilted or oblique satellite images must be rectified to a horizontal
reference plane. The rectified image then must be geometrically corrected into
a particular map projection, such as longitude and latitude or the UTM system.
The importance of these tasks is recognized by the Jet Propulsion Laboratory
(JPL), a center of state-of-the-art remote sensing and image processing. McLeod
and Jafek (1984:75-76) note that


Perhaps the most prodigious technology introduced by the lab is that of the
geographic information system, which co-registers and analyzes a virtually
limitless supply of sensor data types, and then relates them to key
geographical questions within a given region. At one pole of true
state-of-the-art image processing, GIS is the reverse of the imaging technique
that solely enhances immediate visual recognition either a particular
[illegible] or image data set. Rather, GIS is JPL's answer to the need for
analysis of unmanageably large data bases and the need to make responsible
decisions about them.


Each image is first entered into the data base and geometrically corrected
before being registered to the “planimetric base” or system of data planes.
Each image plane is again referenced to one or more georeference planes. The
user is thus able to manipulate data from several sources which, despite their
original disparity, are referenced to a common base.


Since geographic information systems interrelate multiple geographic data sets
that are tied to specific locations, it is clear that the JPL system, although
it primarily uses remotely sensed data, meets this definition.


There are other reasons why geographic information systems and remote sensing
should logically be linked. In recent years various forms of ancillary data,
such as digital terrain models (see above), have been incorporated into remote
sensing applications. During a project that developed classification models for
forest cover type based on remotely sensed spectral data, for example, it was
discovered that incorporation of ancillary terrain data, such as elevation,
slope, and aspect, significantly improved the classification accuracy of the
predictive models (Hoffer et al. 1975; Strahler et al. 1978; Woodcock et al.
1980). Although spectral signatures could distinguish plant cover types to a
fair extent by themselves, it was found that the distributions of many plant
groups were also related to such factors as ground steepness, aspect, and
elevation (Hoffer et al. 1975; Strahler et al. 1978:930), variables that were
not readily obtainable from the remotely sensed imagery. By merging digital
terrain models and the remotely sensed spectral data into a single analytical
data set, not only could the elevation data be obtained, but through various
software techniques, estimates of slope and aspect could be derived, allowing
more powerful predictive models to be developed. The success of these
approaches has led to applications using more varied forms of ancillary data
combined with remotely sensed imagery. Missallati et al. (1979) combined
detailed geologic map data, aeromagnetic data, and radiometric data (all
digitally encoded) with Landsat spectral information to develop predictive
models for uranium exploration. Loveland and Johnson (1983) combined remotely
sensed data with digital terrain data and digital soil survey, land ownership,
and pumping plant location data to develop predictive models to evaluate
irrigation agriculture. This project showed, as Loveland and Johnson put it,
“the flexibility of remotely sensed and other spatial data as input for
predictive models” (1983:1183).


Geographic information systems are potentially useful for manipulation of
geographic data regardless of their source. Recently, this fact has generated
considerable interest in remote sensing circles (see Shelton and Estes 1981 for
an overview). A new perspective has arisen that suggests that the focus of
research should be on the region under investigation (rather than on particular
sources of data) and that all relevant sources of information, regardless of
type or derivation, should be sought for input into the regional GIS. Potential
data sources include traditional thematic maps and a variety of remote sensor
inputs. In this context, the GIS treats each analytical surface, regardless of
source, as simply another data plane. The GIS is able to facilitate
manipulation, analysis, and modeling of these varied data types, treating
information sources individually or in combination.


The importance of incorporating remotely sensed data into comprehensive
geographic information systems is summarized by Shelton and Estes (1981:417):


the full potential of remote sensing cannot and will not be achieved without
continued and expanded efforts to adapt the technology to the evolving needs of
users around the world. To the extent that geographic information system
designs reflect those needs, GIS design ought to be a relevant concern in the
development of new satellite systems and in establishment of institutional
arrangements for processing, formatting, and disseminating the products of
remote sensing.


As a final caveat, however, they note that geographic information systems
represent an evolving technology. Since remote sensing can contribute to the
development of a GIS, e.g., by providing varied forms of data input, they
conclude that full acceptance of both of these technologies “is dependent on
realization that the potential of each technology will not be achieved until
they are integrated.”


The Potential of Geographic Information Systems for 
Regional Archaeological Research


GIS techniques may potentially contribute in a number of ways to regional
archaeological site location research and modeling, and these techniques may
have numerous applications to cultural resource data base management as well.
Some of these potential applications were suggested in a foregoing section on
fundamental concepts; the following sections will elaborate on these
suggestions and add several additional ones.


Spatial Data Management 


A GIS can consolidate and merge many and diverse kinds of geographically
distributed information into a single data base. This is perhaps the most
obvious application of GIS technology to regional archaeological research.
Since archaeological data inherently are geographically distributed, they are
well suited to a GIS context. Varied forms of archaeological data, such as
archaeological site locations, site types, regions that have been field
inspected, and cultural resource sensitive locations, can be merged into a
single data base, together with varied sources of environmental and other
geographically distributed data. Sources of information can be as diverse as
traditional topographic maps, thematic maps (soils, vegetation, geology),
aerial photographs, and remotely sensed spectral data (Kvamme 1986; Parker
1986).


In a regional geographic data base established for the explicit purpose of
developing, testing, and applying predictive archaeological locational models
in southern Arkansas, Scholtz (1981; see also Parker 1985) utilized a
cell-based format containing 3479 cells, each representing an area of 4 ha
(200 m sq). Fifteen biophysical variables, including soil type, elevation,
slope, and distances to streams of various orders, were measured in each cell.
Once the data were measured and formatted within a single computer data base,
an exceedingly powerful tool was established for investigating environmental
patterning exhibited by the locations of known sites and for formulating and
mapping the results of archaeological prehistoric and historical locational
models.


Hasenstab (1983b) developed a GIS for archaeological predictive modeling in
the Passaic River Basin of New Jersey. This data base was established by
electronically digitizing a wide variety of conventional maps and aerial
photographs. Environmental data included soil type, landform, slope, drainage,
agricultural potential, current land use, degree of disturbance, type of modern
development, and distances to the nearest major river course, to confluences of
major rivers, to tributaries, to confluences of tributaries with major rivers,
and to major wetland zones. Management data included the location of known
prehistoric and historical archaeological sites, a gross river basin division,
USGS quadrangle reference, and locational coordinate information. Most of these
data were generated from other digitized sources; the information was stored in
4306 georeferenced cells, each representing an area of approximately
1.15 acres.


Digital terrain tapes were used as the basic data source in a western Colorado
study that attempted to model prehistoric archaeological site locations (Kvamme
1983b). Six secondary surfaces, representing slope, aspect, angle of view,
local relief, vantage locations, and distances to nearest point of vantage,
were generated from the initial elevation surface for each of 5000 cells, which
measured 100 m on a side. Stream courses were manually digitized, and the
stream locations, together with horizontal and vertical distances to nearest
streams, were included in the data base, as were the locations of known
archaeological sites. Other secondary surfaces, in the form of various
probability surfaces of archaeological site presence (based on various
combinations of variables), were also generated from these data.


The Granite Reef Archaeological Project (Brown and Stone 1982) made extensive
use of a GIS for management of the project's data and for purposes of spatial
analysis and archaeological modeling. The Granite Reef project encompassed a
huge area of west-central Arizona, more than 12,000 mi². A variety of basic
environmental data was encoded for cells measuring 1.16 mi on a side, including
elevation, slope, basin divides, aspect, major watersheds, geologic classes,
soil classes, vegetation classes, seasonal precipitation, and
elevation-adjusted temperature extremes. Encoded archaeological data included
the locations of regions surveyed by archaeological field teams and a variety
of site types, ranging from habitation sites to lithic scatters, sherd
scatters, rock rings, rockshelters, rock art, and prehistoric trails. Based on
various arguments and notions about the relative importance of each of the
environmental factors to the prehistoric occupation of the region, the GIS was
used to develop a number of prehistoric land-use models that were weighted
composites of the basic environmental data.





Regional GIS data bases for a southern Federal Republic of Germany study area
and a southern Colorado study area are described by Kvamme (1986; also see
Chapters 7 and 8). These geographic information systems have similar
characteristics in the nature of the data planes that were established and in
their purposes: archaeological locational modeling. Both systems include such
data as elevation, slope, aspect, and measures of local relief, view quality,
vantage locations and distances to nearest vantage, and shelter quality, along
with the complete hydrology network, horizontal and vertical distances to
streams of various Strahler order ranks, and the locations and types of
archaeological sites (approximately 200 sites in the German data base and 1200
sites in the Colorado data base). The German GIS contained nearly 80,000 cells,
each encompassing 1 ha, and the Colorado GIS contained approximately 230,000
quarter-hectare cells. Both systems were used to establish archaeological
locational models based on logistic regression probability functions; these
models were stored as separate GIS surfaces.


In the above geographic information systems various sources and combinations
of management and environmental data, such as archaeological information about
a particular site and its environmental properties or scaled maps of any
surface or combination of surfaces, can be retrieved. One of the chief uses of
the geographic data bases in all of the above studies is to examine and test
environmental hypotheses about archaeological site locations and to develop
various settlement pattern models, including those used for the explicit
purpose of prediction.


Generation of New Data


The use of a GIS makes it possible to derive new data and to explore new
variables and measurement concepts. The ability to derive new data from primary
information initially encoded in a geographic data base was discussed at length
in an earlier section. The speed and accuracy of computers not only allow vast
quantities of information to be generated but also permit extremely complicated
and time-consuming measurements to be performed. The large numbers of
measurements of elevation, slope, aspect, distance to water, etc., that can be
produced demonstrate in part the tremendous workload capabilities of computers.
Another example involving the computation of point-to-point visibility through
the use of an elevation surface illustrates the complexity of calculations that
can be performed. From a given location (grid cell) of known elevation, one
algorithm first approximates the straight-line path through the reference grid
of cells to the desired point or grid cell, which is also of known elevation.
If cells in the straight-line path contain an elevation higher than the higher
of the two end-point cells, a determination of no visibility is immediately
made; if the intervening cell elevations are all lower than the lower of the
two end-point cells, a determination of visibility is immediately made;
otherwise the standard point-slope formula is invoked to determine the equation
of the line-of-sight between the elevations of the end-point cells. In this
third case, the actual elevation for each intervening cell is compared with the
computed line-of-sight elevation at that cell locus to determine if visibility
is blocked (Creamer 1985). Performing this procedure by hand between only two
locations would be incredibly time consuming. Performing such a procedure
between many hundreds of hilltops is impossible without the use of a computer.
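
The three-case visibility test described above might be sketched as follows.
This is an illustration only, not Creamer's program: the straight-line path is
approximated by sampling one cell per step along the longer grid axis, a cell
exactly level with the sight line is treated as not blocking, and the function
name and test profiles are hypothetical.

```python
def line_of_sight(elev, start, end):
    """Return True if the end cell is visible from the start cell."""
    (r0, c0), (r1, c1) = start, end
    z0, z1 = elev[r0][c0], elev[r1][c1]
    steps = max(abs(r1 - r0), abs(c1 - c0))
    # Intervening cells along the approximate straight-line path, each with
    # its fractional position t (0 at the start cell, 1 at the end cell).
    path = [
        (round(r0 + (r1 - r0) * k / steps),
         round(c0 + (c1 - c0) * k / steps),
         k / steps)
        for k in range(1, steps)
    ]
    heights = [elev[r][c] for r, c, _ in path]
    if not heights:
        return True  # same or adjacent cells: nothing intervenes
    if max(heights) > max(z0, z1):
        return False  # case 1: something rises above both end points
    if max(heights) < min(z0, z1):
        return True  # case 2: everything lies below both end points
    # Case 3: compare each cell with the sight-line elevation at its position.
    return all(elev[r][c] <= z0 + (z1 - z0) * t for r, c, t in path)


# Hypothetical elevation profiles (single-row grids) between a 10 m and a 20 m cell.
clear_low = [[10, 5, 5, 5, 20]]      # case 2
blocked_high = [[10, 25, 5, 5, 20]]  # case 1
under_line = [[10, 5, 12, 5, 20]]    # case 3, sight line passes above the 12 m cell
over_line = [[10, 14, 5, 5, 20]]     # case 3, the 14 m cell cuts the sight line

for grid in (clear_low, blocked_high, under_line, over_line):
    print(line_of_sight(grid, (0, 0), (0, 4)))
```

The two shortcut cases settle most pairs without any interpolation, which is
presumably why the algorithm checks them first.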


Computer Cartography 


Within a GIS it is easy to display information using computer graphic
cartographic techniques. Advances in computer graphics and cartography (e.g.,
Edwards and Batson 1980) allow maps to be produced rapidly and accurately,
incorporating uses of color, shading, and three-dimensional perspective that
are unavailable in traditional cartography. The flexibility of computer graphic
and cartographic techniques can increase the importance of these methods as
research tools in site location studies. Simply by producing maps of individual
analytical surfaces, a researcher might gain insights that could be useful in
formulating analysis plans or in interpreting analysis results. In addition to
traditional maps displaying elevation contours and a hydrology network, maps of
new concepts, such as distance to water, aspect, terrain variability, or
vegetation diversity, can be produced. Rather than simply producing a map of
site locations, the researcher might create a map of an extrapolated site
location pattern, which could lead to better insight into the nature of
prehistoric land-use patterns. Animation techniques (Moellering 1980) might be
used to portray such dynamic processes as landform erosion, air-flow patterns
(Tesche and Bergstrom 1978), or changing patterns of settlement through time.


Certain analytical surfaces from a GIS developed for investigating prehistoric
patterns of settlement in southeastern Colorado (Kvamme 1984) can be used to
illustrate these ideas. Five analytical surfaces from a 230 mi² portion of the
study area, containing approximately 230,000 cells, each 50 m on a side, are
portrayed in Figures 10.11-10.13a. Figure 10.11a is a slope surface. Steep
locations (cells) are dark and level locations are light. The surface depicting
aspect or principal orientation of the ground surface is shown in Figure
10.11b. In this figure, light shading represents south-facing terrain while
dark shading represents north-facing terrain. Note that this surface tends to
portray features of the topography related to the drainage systems. The
complete hydrologic network is portrayed as the white lines in Figure 10.12a.
Also portrayed in this figure are distances to the nearest of these drainages.
This information was computed for each of the nondrainage cells, but here, to
facilitate display, these data are represented by shading that indicates five
categories of distance. A similar map is given in Figure 10.12b, but only a
subset of the streams (second Strahler order or greater) is portrayed. Finally,
a local relief surface is depicted in Figure 10.13a, which portrays relative
terrain roughness and offers contrast between locations of greater and lesser
relief. In each cell the range in elevation within a 300 m radius has been
determined; high relief values are dark and tend to portray high plateau rim,
hilltop, and canyon regions, while low relief values are light and portray
plainslike areas. All of these maps portray the same region, but each offers a
different way of looking at the landscape.


Perhaps by noting how the distribution of known sites corresponds with these
and other surfaces an investigator might better be able to select variables to
examine or on which to concentrate in later analyses. Alternatively, an
analysis might suggest that certain variables bear a strong relationship with
known locations of a particular type of site. In any case, viewing a picture of
the mapped variables (Figures 10.11-13a) can give the researcher added insight
about his or her findings.


Evaluation of Spatial Statistics


Geographic information systems can be used to examine and evaluate sampling
designs and various statistical models. An established regional GIS with known
population parameters can be used to investigate (through simulation) the
effects of different sampling designs within the region. It might be possible,
for example, to investigate a variety of hypothetical sampling designs prior to
fieldwork in an effort to fine-tune a particular design to the characteristics
of the region under study.


In a similar vein, it is possible to investigate a variety of spatial
statistical models and issues. For example, most statistical procedures assume
independent observations, but it is usually not possible to meet this
assumption when sampling from spatial contexts owing to the presence of
positive spatial autocorrelation (Cliff and Ord 1973; also see Chapter 8).
Positive spatial autocorrelation has the effect of altering the performance of
various statistical models; e.g., levels of significance tend to be overstated
(Haggett et al. 1977:329-377). It might be possible to use GIS data bases as a
means of empirically investigating the performance of various statistical
models in spatial contexts under controlled conditions, with known
autocorrelation structures, perhaps allowing various model corrections to be
made (for example, Cliff and Ord 1973).

Figure 10.11. GIS-generated surfaces depicting 230,000 individual measurements
contained in 50 m cells across a 230 mi² central Colorado study region.
(A) Slope surface: dark regions represent steep ground and light regions
represent level ground. (B) Aspect surface: light regions are south-facing and
dark


In a simulation study that used a GIS to investigate levels of spatial
autocorrelation under various geographic sampling designs, the effects of this
problem with regard to variables commonly used in regional archaeological
research were examined (Kvamme 1985). The GIS-based simulation used a 10 by
10 km region as the sampling universe, and for each of five runs of the
simulation a different simple random sample and a different regular systematic
sample of 100 locations (1 ha grid cells) were selected. Spatial
autocorrelation statistics were calculated for each variable for each run. The
results indicated extremely high levels of positive spatial autocorrelation
regardless of sampling design (some of these results are presented in
Chapter 8).
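
The text does not say which autocorrelation statistic was used in that study.
As an illustration of the general idea, the sketch below computes Moran's I,
one widely used measure, for a complete grid; the choice of statistic, the use
of edge-sharing (rook) neighbors with equal weights, and the function name are
all assumptions made for the example.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid using rook (edge-sharing) neighbors."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    dev = [[v - mean for v in row] for row in grid]
    cross = 0.0      # sum of products of deviations over neighboring pairs
    weight_sum = 0   # number of (directed) neighbor pairs
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight_sum += 1
    total_ss = sum(d * d for row in dev for d in row)
    # A grid with no variation has total_ss == 0 and no defined Moran's I.
    return (n / weight_sum) * (cross / total_ss)


# A smooth gradient (neighbors alike) and a checkerboard (neighbors unlike).
gradient = [[r + c for c in range(5)] for r in range(5)]
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]

print(morans_i(gradient))
print(morans_i(checker))
```

Values near +1 indicate that neighboring cells resemble one another, as is
typical of elevation, slope, and distance surfaces; values near -1 indicate
that neighbors differ systematically.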


Testing Locational Hypotheses


GIS data bases can be used to test archaeological locational theories and to
address other research questions. When a variety of primary and secondarily
derived environmental and cultural variables have been previously calculated
for a study region in a cell-based GIS, the need for additional measurement can
be eliminated. The locations of all known archaeological sites in the region
can be easily and rapidly correlated with environmental and other features in
the data base. Alternatively, the relationships between GIS data base features
and various subsamples of known sites, sites of specific functional types, or
sites belonging to a particular period of time can be investigated. For
investigators using a control-group approach as a plan for research (see
Chapter 8), very large nonsite samples of background environmental or cultural
data can be obtained both for model development and for model testing.


Cell-based geographic information systems are ideally suited for an analytical
approach to site location research that treats the individual cell (which
corresponds to a parcel of land) as the unit of analysis, especially when the
cell size is fairly small, e.g., the size of a typical prehistoric site or
smaller. Cells that are found to contain artifacts or other cultural remains
are simply “flagged” by the computer, thus eliminating site definition problems
since the site is no longer the unit of analysis. Relationships between the
flagged cells and environmental and other features included in the data base
are then examined during model development. Analysis might compare
characteristics of cells containing no prehistoric evidence with those of cells
that contain prehistoric evidence, for example. Once criteria have been defined
for identifying functional site types, site type analyses could be conducted by
noting which cells exhibit the required criteria and by flagging cells with a
specific site type code. Alternatively, since function is often difficult to
determine, it might be possible to rank (or continuously measure) cells that
contain cultural evidence according to artifact diversity or to amounts of
inferred prehistoric activity, using various threshold levels of amounts of
prehistoric evidence. Various location models might then be developed in which
the dependent variable is an index of artifact counts, diversity, or levels of
prehistoric use.


GIS data bases are well suited for testing certain types of site locational
theories. It might be postulated, for example, that certain kinds of
archaeological sites in a study region should be located close to sources of
water. A GIS data base could be used to determine empirical distances to water
at known sites of the type under investigation in order to test this
hypothesis. It should be recognized, however, that all parts of the study
region might generally lie close to water sources. Hence, even if the sites
tend to be located close to water sources, this tendency could be a result of
the nature of the background environment rather than of prehistoric
selectivity, for example. Measurements from the background environment might
yield a distribution of distances to water identical to that for sites, which
would suggest no selectivity, or the distributions might be radically
different, suggesting selectivity. GIS techniques are ideal for investigating
such an issue because they can provide many thousands of background
measurements of environment against which known site distributions can be
compared.


To illustrate the power of geographic information systems for analysis
purposes, a simple histogram is presented in Figure 10.14a that illustrates the
Euclidean distance to the nearest drainage of Strahler order rank two or
greater as measured by a GIS in 230,000 contiguous cells (50 m on a side) in
central Colorado (Kvamme 1984). This figure clearly illustrates the nature of
the background environment in this region with respect to this variable. The
histogram of the same variable measured only at the locations (cells) of nearly
600 known open-air lithic scatters in the area portrays a distinct tendency for
the sites to be located in relatively greater proximity to second-order streams
(Figure 10.14b). For example, 50 percent of the sites occur within 150 m of
second-order or greater drainages, while only 17 percent of the study region as
a whole exhibits a similar proximity to these drainages; 90 percent of the
sites lie within 950 m of the drainages, while only 69 percent of the study
region lies within this distance. Since the sample of open-air sites was
obtained through a random sampling design, the patterning apparent in Figure
10.14 is difficult to refute and points to the tremendous potential of
geographic information systems for archaeological locational investigations.
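
The comparison of site and background distributions reduces to computing, for
each distance threshold, the share of each group lying within that distance. A
minimal sketch follows; the distances are invented for illustration and are not
the Colorado data, and the function name is hypothetical.

```python
def proportion_within(distances, threshold):
    """Share of the measurements that lie at or within the threshold distance."""
    return sum(1 for d in distances if d <= threshold) / len(distances)


# Hypothetical distances to water in metres (NOT the values behind Figure 10.14).
background = [0, 50, 100, 150, 200, 300, 400, 500, 700, 900, 1200, 1500]
sites = [0, 50, 50, 100, 150, 150, 200, 400]

for threshold in (150, 500, 950):
    site_share = proportion_within(sites, threshold)
    background_share = proportion_within(background, threshold)
    print(threshold, round(site_share, 2), round(background_share, 2))
```

A site share well above the background share at short distances, as in the
50 percent versus 17 percent contrast reported above, is what suggests
selectivity rather than a mere reflection of the environment.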


Locational Modeling 


GIS techniques are well suited for the development, testing, and application of
archaeological locational models of any type (see, for example, Chapter 8). The
only limitations are that appropriate forms of geographically distributed
information (including remotely sensed data and specialized map or aerial
photograph data) must be merged into the data base and that the cell resolution
or size must be appropriate for the modeling problem.


Figure 10.14. Histograms of Euclidean distances to the nearest second Strahler
order or greater stream in a central Colorado study area. (A) 230,000 distances
measured every 50 m across the study region represent the nature of the
environment as a whole with respect to this variable. (B) 538 distances
measured only at the locations of a representative sample of open-air lithic
scatters. A comparison of the two histograms suggests a tendency for sites to
be located in proximity to these drainages.


In developing quantitative models based on probability or mathematical
functions of multiple geographic variables, geographic information systems can
be used to obtain environmental and other variables at the locations of known
sites (or site types) to provide the basic analysis data. During model testing,
geographical data merged with a second sample of sites can be retrieved and
used as a basis for a variety of accuracy tests (see Chapter 8). Finally,
geographic information systems can be employed to specify the results of a
model across a region of study by applying the model (i.e., the probability or
mathematical function) to the data stored in each cell and producing a map of
the results.


Figure 10.13b illustrates a prehistoric “site probability surface” derived 
through a logistic regression technique (see Chapter 8) for the site class of 
“open-air lithic scatters” in the central Colorado project described earlier. This 
model is based on a sample of nearly 300 known open-air sites and a control group of 
approximately 1200 locations representing the background environment (nonsites). 
In each of the 230,000 cells in this figure an estimated probability of site-class 
membership was derived, conditional on seven environmental variables within the 
GIS data base (including those illustrated in Figures 10.11-10.13a). Computer 
cartographic techniques were used in Figure 10.13b to shade cells having p-values 
nearest to 1 with dark tones, to shade cells with p-values near 0 in light tones (or 
unshaded), and to shade cells with intermediate p-values in intermediate tones. The 
result is a visual representation of the extrapolated pattern of open-air site 
placement, based on the sample data. 
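

The cell-by-cell mapping step described above can be sketched in a few lines of Python. This is 
only an illustration under stated assumptions, not the Colorado model itself: the intercept and 
coefficients are invented placeholders, two hypothetical variables (slope and distance to water) 
stand in for the seven variables used in the project, and the shading thresholds are arbitrary. 

```python
import math

# Hypothetical coefficients for illustration only; the fitted values of the
# central Colorado model are not given in the text.
INTERCEPT = 1.5
COEF_SLOPE = -0.08    # change in log-odds per percent of slope
COEF_WATER = -0.002   # change in log-odds per metre of distance to a stream

def site_probability(slope_pct, dist_water_m):
    """Logistic function: estimated probability of site-class membership."""
    z = INTERCEPT + COEF_SLOPE * slope_pct + COEF_WATER * dist_water_m
    return 1.0 / (1.0 + math.exp(-z))

def probability_surface(slope_grid, water_grid):
    """Apply the model to every cell of two co-registered raster layers."""
    return [[site_probability(s, w) for s, w in zip(srow, wrow)]
            for srow, wrow in zip(slope_grid, water_grid)]

def shade(p):
    """Map a probability to a shading symbol (thresholds are arbitrary)."""
    if p >= 0.75:
        return "#"   # dark tone
    if p >= 0.50:
        return "+"   # intermediate tone
    if p >= 0.25:
        return "."   # light tone
    return " "       # unshaded

# Two tiny example layers (2 rows by 3 columns), values invented.
slope = [[2, 5, 30], [4, 12, 40]]
water = [[100, 400, 1500], [50, 900, 2500]]

surface = probability_surface(slope, water)
for row in surface:
    print("".join(shade(p) for p in row))
```

In a real application the coefficients would come from the logistic regression fitted to the 
site and nonsite samples, and each layer would hold one value per GIS cell. 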


This model was also tested using a GIS. Test results from an independent 
validation sample of an additional 300 open-air sites and 1200 background locations 
suggest that about 95 percent of the sites (92-97 percent at an approximate 95 
percent level of confidence) should occur in all the shaded zones of the map, 
although these shaded areas constitute only 62 percent of the total land area. The 
results also indicate that approximately 20 percent of the sites (16-25 percent at ca. 
95 percent confidence levels) should occur in the highest sensitivity zone (the 
darkest shading level), which covers less than 4 percent of the total land area 
(Figure 10.13b). 
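

The text does not say how the confidence intervals on these test proportions were computed. As 
one hedged illustration, the Wilson score interval for a binomial proportion can be calculated 
as below. The counts used (285 of 300 and 60 of 300 test sites) are assumptions chosen to match 
"about 95 percent" and "approximately 20 percent"; the actual counts are not reported here. 

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95 percent Wilson score interval for a proportion."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Assumed counts: 285 of 300 test sites fall in any shaded zone,
# and 60 of 300 fall in the darkest (highest sensitivity) zone.
lo_all, hi_all = wilson_interval(285, 300)
lo_top, hi_top = wilson_interval(60, 300)

print(f"all shaded zones: {100 * lo_all:.0f}-{100 * hi_all:.0f} percent")
print(f"darkest zone:     {100 * lo_top:.0f}-{100 * hi_top:.0f} percent")
```

Under these assumed counts the intervals round to 92-97 percent and 16-25 percent, which agrees 
with the ranges quoted above, although that agreement does not establish which method was 
actually used. 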


For deductively derived modeling approaches, model development cannot be 
carried out within a GIS framework since these approaches do not begin by seeking 
patterns in empirical data. Such models are based on theoretical principles concern- 
ing human choice and settlement behavior and consist of deductions about the 
locations at which human occupation should occur. Once these models have been 
established, however, geographic information systems can be used for model 
testing and broad-area applications. 


One problem in applying many deductively based models lies in data 
requirements. For example, to apply central-place modeling techniques (Johnson 1977), 
which assert the importance of central places to a regional pattern of settlement, 
one must know the locations of contemporary central places. Gravity models 
(Hodder and Orton 1976:187), which emphasize the importance of specific natural 
resources (e.g., food resources or lithic quarries) or cultural entities (e.g., road 
networks or central places), require locational data for each of these phenomena. 
Models based on caloric cost-benefit or energy calculations (e.g., Casteel 1972; 
Zubrow 1971) require detailed environmental information. In one modeling 
approach based primarily on environmental data, Jochim (1976) was able to arrive at 
several deductions concerning hunter-gatherer settlement by synthesizing a wide 
range of ethnographic and other information. Unfortunately, the required data for 
application of the model, which included detailed information about such items as 
the food potential of several prehistoric plant and animal species, their relative 
proportions, and their seasonal abundance, were so difficult to obtain in a reliable 
form in the time period and region to which the model was applied (the Mesolithic 
of southern Germany) that it was difficult to realize the full potential of the model. 


GIS techniques may offer a solution to some of these problems, provided that 
the relevant data can be gathered and incorporated within a GIS framework. A 
variety of map sources or even zoological models might be used, for example, to 
describe the distributions of certain species of interest, and remote sensing 
techniques might be used to identify prehistoric central places, road networks, major 
plant groups, favorable plant diversity, or other features. Once the archaeological 
locational model is formulated and made operational in computer terms, computer 
mapping techniques in conjunction with GIS features provide an easy means of 
applying the model across the region of interest. Testing of any model demands 
similar procedures regardless of how the model is developed (testing procedures are 
described in detail in Chapter 8), and as described above, geographic information 
systems are well suited for model testing purposes. 


The test study region of 19,000 grid cells that was used to illustrate the 
quantitative models in Chapter 8 can be used to indicate the potential of 
geographical information systems in an a priori model specification perspective. Whether an 
archaeological locational model is derived simply through a series of “shotgun” 
questions put to a GIS or through a series of deductions concerning the 
interrelationships between certain environmental features and the positioning of human 
settlements in space, a GIS can be used to map the results of the modeling process. 
As a simple example, a base model might specify that settlements should occur on 
ground surfaces with slopes less than or equal to a 12 percent grade (Figure 10.15a). 
The next refinement of this model might then suggest that settlements should be 
found within a fixed distance, say 1000 m, of relatively secure water, such as second 
Strahler order or greater streams (Figure 10.15b). Finally, the model might be 
amended to include the requirement that particular settlement locations (e.g., 
those of winter villages) will have a south-facing orientation (Figure 10.15c). At each 
stage in the development of this model, accuracy, in terms of the percentage of 
known sites correctly classified and the percentage of the region classified by the 
model as “site-present,” could be assessed by the GIS, providing ongoing and 
interactive model performance indications. 
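

The three-stage refinement and its running accuracy check can be sketched as follows. This is a 
minimal illustration with invented cell records, not the Chapter 8 study region. The field 
names, the definition of "south-facing" as an aspect between 135 and 225 degrees, and the 
application of the aspect rule to all cells (rather than only to particular settlement types) 
are simplifying assumptions. 

```python
def classify(cells, max_slope=12.0, max_water=None, aspect_range=None):
    """Return one boolean per cell: True where the model says 'site-present'."""
    predicted = []
    for c in cells:
        ok = c["slope_pct"] <= max_slope
        if max_water is not None:
            ok = ok and c["dist_stream_m"] <= max_water
        if aspect_range is not None:
            low, high = aspect_range
            ok = ok and low <= c["aspect_deg"] <= high
        predicted.append(ok)
    return predicted

def performance(cells, predicted):
    """Percent of known sites correctly classified, percent of region flagged."""
    at_sites = [p for c, p in zip(cells, predicted) if c["site"]]
    pct_sites = 100.0 * sum(at_sites) / len(at_sites)
    pct_area = 100.0 * sum(predicted) / len(predicted)
    return pct_sites, pct_area

# Eight invented cells; 'site' marks cells containing a known site.
cells = [
    {"slope_pct": 3,  "dist_stream_m": 200,  "aspect_deg": 180, "site": True},
    {"slope_pct": 8,  "dist_stream_m": 600,  "aspect_deg": 90,  "site": True},
    {"slope_pct": 10, "dist_stream_m": 1500, "aspect_deg": 200, "site": False},
    {"slope_pct": 20, "dist_stream_m": 300,  "aspect_deg": 170, "site": False},
    {"slope_pct": 5,  "dist_stream_m": 900,  "aspect_deg": 150, "site": True},
    {"slope_pct": 15, "dist_stream_m": 2000, "aspect_deg": 10,  "site": False},
    {"slope_pct": 11, "dist_stream_m": 400,  "aspect_deg": 300, "site": False},
    {"slope_pct": 2,  "dist_stream_m": 1200, "aspect_deg": 190, "site": True},
]

stage1 = performance(cells, classify(cells))
stage2 = performance(cells, classify(cells, max_water=1000))
stage3 = performance(cells, classify(cells, max_water=1000,
                                     aspect_range=(135, 225)))
for name, (sites, area) in (("slope", stage1), ("+ water", stage2),
                            ("+ aspect", stage3)):
    print(f"{name:9s} sites correct: {sites:5.1f}%  region flagged: {area:5.1f}%")
```

Each added criterion shrinks the flagged portion of the region but may also drop known sites, 
which is the trade-off the analyst would watch at every stage. 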
BRIEF OVERVIEW OF THREE COMMON GEOGRAPHIC 
INFORMATION SYSTEMS 


Timothy A. Kohler 


MOSS/MAPS 


MOSS (Map Overlay and Statistical System) is a GIS originally developed by 
the Western Energy and Land Use Team, U.S. Fish and Wildlife Service (Table 
10.1, above). It has been in continual development over the past few years with 
cooperation from the Bureau of Indian Affairs, the Bureau of Land Management, the 
Forest Service, the Geological Survey, and the Soil Conservation Service (Lee et al. 
1984). Thus, unlike most geographic systems it is in the public domain, although a 
superset of MOSS is marketed by Autometric of Fort Collins, Colorado, a firm that 
is also developing a more advanced GIS, based on MOSS, called DELTAMAP (Reed 
1986). Most storage and processing in MOSS is in a vector or polygon format, 
although some raster capabilities are available. 


Additional raster capabilities, designed in part to allow the incorporation of 
data derived from digitized images, are available through the Map Analysis and 
Processing System (MAPS) subsystem, an extensively enhanced version of MAP, 
originally developed at Yale University. To some extent, MAPS and MOSS can pass 
files back and forth. Input to MOSS is through MAPS; AMS, the Analytical 
Mapping System; or ADS, the Automated Digitizing System. Enhanced 
cartographic plotting, beyond the normal capabilities of MOSS or MAPS, is provided by 
the Cartographic Output System (COS). 


Beyond the general capabilities of geographic information systems as 
described earlier in this chapter, MOSS and MAPS have special capabilities that are of 
interest for predictive locational modeling of archaeological sites. These include 
routines that 


— collect a random sample of points, lines, or polygons for further analysis 
or for input to statistical procedures 


— measure the distance between any two points along a path (which need 
not be straight) or along a straight line 


— determine the length of all lines of each subject (e.g., first-order streams) 
in a line map or the total distance around each subject in a polygon map 


— identify locations within a specifiable distance of a point, line, or polygon 
subject type 


— produce a three-dimensional display of any integer-valued continuous 
map 


— create a map of azimuthal aspect or a slope map from a continuous 
elevation map 


— create a map showing the visibility of locations from a specifiable 
observation point or points 


— create a cross-sectional image between any two points (usually this 
routine is used for elevation data, but it is suitable for any continuous map) 


— create a map showing the minimum effort path to a target cell; the 
analyst can assign weights to various features acting as partial barriers in the 
path-finding process (an example of a fairly common GIS capability for corridor 
analysis) 


— create a map showing the steepest downhill path through varying terrain 
(essentially the path along which water would flow) from a target area 


The MOSS/MAPS package provides very flexible routines for overlay and 
neighborhood analysis, map description, and data management. A principal 
advantage of this package is that it is used and supported by numerous federal agencies, so 
that new features are being added to it at a rapid rate. At present MOSS/MAPS has 
only very limited capabilities for inferential statistical analysis (Table 10.3). Versions 
are available for 16- and 32-bit microcomputers, minicomputers, and mainframes. 


TABLE 10.3. 
Statistical functions (beyond simple descriptive statistics) available in three commonly utilized 
systems 


Function MOSS/MAPS IDIMS VICAR/IBIS 
Supervised cluster analysis x x 
Unsupervised cluster analysis x 
Principal components analysis x 
Least squares analysis 
Divergence calculations x 
Cross-tabulation x 


x 
x 
\ 





IDIMS 


Unlike the public-domain system MOSS/MAPS, the Interactive Digital Image 
Information System is a commercial product of the Electromagnetic Systems 
Laboratory, Inc., in Sunnyvale, California. Like VICAR, which is discussed below, IDIMS 
is primarily an image-processing system; for this reason, data are organized in a 
raster format, and many functions that address problems specific to the processing 
of digital images, such as image-enhancement routines, are available. Many other 
IDIMS functions are useful for more general kinds of spatial analysis, however, so it 
also warrants consideration as a GIS. IDIMS incorporates a data-entry component, 
the Geographic Entry System (GES), and the Earth Resources Inventory System 
(ERIS) for data base management and statistical functions (Electromagnetic 
Systems Laboratory n.d.; Hansen 1983). 
Special features beyond functions that are routine in an image-processing or 
geographic information system, or that might be of special interest for 
archaeological locational analysis, include 


— a procedure for overlaying up to 10 maps (or images) at one time, rather 
than the two at a time possible in MOSS/MAPS 


— a procedure that passes a three-by-three-cell moving window across a 
land-cover map to create a diversity index 


— procedures for creating slope and aspect maps from digital elevation data 


— a procedure for creating a shaded relief map from a digital elevation map 
with the sun in a specifiable location 


— a procedure for creating a proximal map that assigns each cell to the 
nearest given x,y location 


— various procedures for generating random samples of images for further 
analysis 
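

The three-by-three moving-window diversity index listed above can be illustrated with a short 
sketch. How IDIMS defines the index is not stated here, so this example assumes the simplest 
definition, the number of distinct land-cover classes inside the window, and it clips the 
window at the map edge; both choices are assumptions, and the class codes are invented. 

```python
def diversity_map(landcover):
    """For each cell, count distinct classes in its 3 x 3 neighborhood.

    The window is clipped at the map edges, so border cells use fewer
    than nine neighbors (an assumption made for this sketch).
    """
    rows, cols = len(landcover), len(landcover[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            classes = set()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        classes.add(landcover[rr][cc])
            out[r][c] = len(classes)
    return out

# Invented land-cover codes (e.g., 1 = grassland, 2 = woodland, ...).
landcover = [
    [1, 1, 2, 2],
    [1, 1, 2, 3],
    [4, 1, 2, 3],
]

diversity = diversity_map(landcover)
for row in diversity:
    print(row)
```

Cells near the boundaries between cover types receive the highest values, which is why such an 
index might serve as a rough measure of local resource variety. 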


IDIMS runs on a minicomputer and is used by several large federal agencies. 
Hansen (1983) has discussed the creation of a “generic” GIS through combining the 
most useful features of MOSS, MAPS, IDIMS, and their various allied programs for 
data entry, management, and display. 


VICAR/IBIS 


The Video Image Communication and Retrieval (VICAR) system was 
developed at the Jet Propulsion Laboratory (JPL) to process image data from the 
planetary exploration programs of the late 1960s and 1970s (Bracken et al. 1983; Hart 
and Wherry 1984). Unlike MOSS/MAPS and IDIMS, VICAR is designed to run on 
large-scale digital computers and is normally restricted to IBM systems, since a 
substantial proportion of its code is written in IBM 360/370 Assembler Language. A 
subset of VICAR, called mini-VICAR, was developed to run on DEC 
minicomputers, but it appears that this system is no longer actively used. A DEC VAX version 
of full VICAR is now in use at JPL, however. Like MOSS/MAPS, VICAR is in the 
public domain. With well over 100 application programs running at about 25 
installations around the world, VICAR is a very powerful and widely utilized 
image-processing system. 

The Image Based Information System (IBIS) is an enhancement to VICAR also 
developed at JPL (Bracken et al. 1983; Zobrist and Bryant 1979). The IBIS programs 
give VICAR/IBIS some of the capabilities of a GIS, including overlay analysis and 
vector-to-raster conversion, which allows geo-coded information not normally 
available in raster (cell) formats (such as maps) to be analyzed. 


The majority of VICAR application programs are specialized for image 
processing, a task that may sometimes be important in predictive archaeological 
modeling, particularly when map-based data are unavailable. It is also important 
to remember that most modern, map-based data are the result of image 
interpretation of some sort. A few of the VICAR programs of potential importance to 
locational modeling include 


— a multivariate classifier program using Bayes’s maximum likelihood 
algorithm, which yields a classified image (map) and, optionally, a confidence map 
for that classification. This program accepts input either from a supervised 
classification analysis, in which the user specifies certain “training areas” on 
which the classification function is to be based, or from an unsupervised 
cluster analysis 


— a multivariate classifier program that uses a combination of 
parallelepiped and maximum likelihood techniques, accepting input from either a 
supervised or an unsupervised analysis 


— a program for performing edge enhancements and, optionally, for 
making edge existence decisions 


— a principal components analysis of up to 12 input variables 


— a least-squares program that will, among other things, calculate and 
display trend surfaces and residuals from trend surfaces 


— a program that simulates the effect of shading from a specifiable angle of 
illumination on any continuous image 

The fact that VICAR/IBIS typically runs on large mainframe computers has 
both advantageous and disadvantageous aspects. In installations with which I am 
familiar, VICAR/IBIS runs as a “batch” program, meaning that jobs are submitted, 
and the output later (possibly much later) received, with no intermediate 
interaction between the user and the processing system. Obviously, it is desirable to have 
fast response to user query in an interactive mode, as is typically the case for 
geographic information systems running on mini- or microcomputers. There is 
great analytical utility in being able to see the mapping of some function unfold 
before your eyes, perhaps to be interrupted and modified if necessary in its early 
stages. On the other hand, some batch systems, such as VICAR/IBIS, have a huge 
variety of sophisticated functions, and their mainframe implementation allows the 
use of very large data bases. As new data storage technologies, such as laser disks, 
become available, and as the cost of data storage continues to drop, one of the 
advantages of mainframe-based systems will disappear. On the other hand, as 
cheaper and more powerful local workstations begin to share processing with 
mainframes, the easy dichotomy between mainframe- and microcomputer-based 
geographic information systems will also become fuzzy, and batch systems will 
probably become things of the past. 
MODEL DISPLAY VS MODEL BUILDING 


Timothy A. Kohler 


Earlier in this chapter, Kvamme discussed many realized or potential 
applications of geographic information systems to general spatial research in archaeology, 
including the construction, testing, and use of predictive locational models. Within 
the category of model building, a distinction can be made between the processing 
necessary to build a data set suitable for inferential statistical testing and the 
application of inferential procedures (for example, linear regression) to discover the 
“best” locational model. It is important for managers to realize that, in this present 
phase of development, most geographic information systems are much better suited 
to the first task than to the second. Constructing an inferential model of site 
location inevitably involves the application of inferential statistics to surveyed areas 
that contain or are devoid of archaeological resources. Geographic information 
systems give unsurpassed power for the extrapolation of such models, once 
constructed, to the area from which the samples were originally drawn, but actual 
inferential statistical functions available in many geographic information systems 
are rather limited (Table 10.3). This is not a fatal weakness for the application of a 
GIS for model building if the GIS has the ability to format a file for use by a 
general-purpose statistical package, such as SAS or SPSS, as is usually the case. It 
does mean, however, that a GIS is usually not the only software needed for 
analysis of spatial data. 


Of course, geographic information systems are an important aid in model 
construction since they can be used to collect data to be passed to an inferential 
statistical analysis. As pointed out by Kvamme, anyone who has conducted a 
quantitative settlement pattern analysis (examining the distances from known 
archaeological resources and random points to various features of the natural 
environment and evaluating the composition of catchments around both sites and 
random points) knows how tedious and prone to error these hand measurements 
can be. In a GIS suitable for archaeological analysis, such measurements can be made 
automatically for any of the available data planes or maps. These measurements 
constitute secondary surfaces that can be stored as new maps on which the locations 
of sites and points without sites are replaced by measurements of catchment 
composition and distances to critical resources around these points. These 
measurements can then be passed to another system for statistical analysis, and in this 
manner the most tedious portion of model construction has been automated. 
Perhaps when it is easier to consider variables related to catchment values and 
distances to resources, these variables will be used more frequently and effectively 
in predictive locational modeling than they have been to date. 
IMAGINARY SESSION WITH A GENERIC GIS 


Timothy A. Kohler 


Despite the growing literature about and increasing accessibility to geographic 
information systems, these systems remain mysterious to most archaeologists. 
What follows is a poor man’s substitute for the only experience that can really 
convey both the usefulness and the limitations of these systems—a “hands-on” 
session. This example illustrates how a GIS might be used to map a model that has 
already been developed, either by using the GIS for data collection or by some other 
means. Limiting this example to mapping rather than development of a model 
should help the reader who has no acquaintance whatsoever with geographic 
information systems to understand how they work. Additionally, as pointed out in 
the previous section, building an inferential model is essentially a statistical task in 
which the GIS serves as a technical assistant for data collection and management. 
The specific techniques discussed are more appropriate for image-processing-based 
systems (such as IDIMS) than for many geographic information systems, and there 
would certainly be more efficient ways to approach this task on some systems. 


You sit in front of a high-resolution graphics terminal attached to a minicom- 
puter or a “supermicro” running a relatively advanced GIS. The most tedious and 
expensive work —digitizing various maps for the data base, correcting digitizing 
errors, geometrically correcting remote sensing imagery, tying that imagery into 
ground control points, and so forth—has already been done. Previous researchers 
have interpreted available Landsat imagery to yield digitized maps of vegetation 
type and density and of current land use. Likewise, digital elevation models 
available on computer tape from the USGS (Elassal and Caruso 1983) providing 
elevations for points at 30 m intervals have already been processed to yield 
secondary maps of slope and aspect. Each of these digitized maps has been stored on 
disk or tape and is accessible to the computer, and each constitutes a data plane or 
theme. Themes available to you for our imaginary GIS session are shown in Table 
10.4. These data are available for an area about 51 km on a side, the largest area your 
monitor can display at a resolution (picture element, pixel, or cell size) of 50 m on a 
side. More than a million (1024²) pixels are displayed on your screen, which shows an 
area equivalent to that portrayed by about 16.5 USGS 7.5-minute topographic 
quadrangles. You can enlarge any portion of this area to fill the whole screen if you 
wish to see a subset of the area in more detail. 


Relatively low quality copies of the contents of the screen can be obtained 
quickly and cheaply in black-and-white on a peripheral dot-matrix printer; high- 
quality color copies can be obtained using a peripheral pen plotter or a high- 
resolution color ink-jet printer. The system at your disposal cost somewhere 
between $40,000 and $125,000 and so must be shared by many different users, most 
of whom are involved in natural resources inventory and analysis. 
TABLE 10.4. 
Data themes available for your GIS session 


Description                                              Organization 
Archaeological sites with attributes for type and age    Polygon 
Aspect                                                   Cell 
Elevation                                                Cell 
Extent of archaeological survey                          Polygon 
Modern land use                                          Polygon 
Roads/streams with attributes for types/orders           Line 
Soil type                                                Polygon 
Slope                                                    Cell 
Vegetation types                                         Polygon 
Vegetation density                                       Polygon 





You wish to map a simple site location model that predicts, for example, that 
frequencies for two types of sites will be relatively high in locations satisfying two 
slightly different sets of criteria. Requirements for the first site type are locations 
with 

— less than 5° slope, 


— no more than 8000 ft elevation, and 
— permanent water and pinon-juniper woodland no farther than 0.25 km 
away. 
The second class of sites is likely to occur in areas with 
— no more than 10° slope, 
— seasonal or permanent water no more than 0.25 km distant, 
— no more than 7500 ft and no less than 6000 ft elevations, 
— arable soils no more than 0.1 km distant, and 
— locations at the base of a slope. 


You wish to create one map showing those areas most likely to have site types 1, 2, 
both, or neither. 


There are many ways to approach this problem; details of the “best” approach 
depend on the characteristics of the particular GIS at your disposal. One likely 
approach —ignoring technical details and considering only the general strategy — 
would be to select all locations for each site type on each data plane that are 
favorable to settlement and code them with a 1, coding all other areas with a 0. Once 
this operation is completed for each relevant data plane (that is, for each map of a 
particular variable or environmental characteristic), the four data planes (in the case 
of site type 1) or five data planes (in the case of site type 2) can be electronically 
overlaid, with values from the same location on each map being summed together. 
This procedure is analogous to overlaying a series of accurately positioned and 
extremely detailed mylar maps to produce one new map in which each location is 
the sum (or some other function) for that location of the information presented on 
all the overlain maps. The next step would be to recode all areas that yielded a sum 
of 4 in the first analysis to 1, with other areas assigned 0; all areas with a 5 in the 
second analysis would be recoded to a 2, with other areas assigned 0. In this fashion 
two summary maps would be created, one for each site type. These, in turn, would 
be overlaid to create a final map in which any location with a 1 would meet the 
criteria for site type 1 only; any location with a 2 would meet only the criteria for site 
type 2; and locations with a 3 would meet the requirements of both types. Locations 
coded 0 would be considered not to meet the requirements of either type. 
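

The reclassify, overlay, and recode strategy just described can be sketched on a toy grid. The 
function names and all data values below are invented for illustration; a real GIS would supply 
its own commands for these operations. The proximity and base-of-slope planes are given directly 
as 0/1 grids, since producing them is a separate step. 

```python
def reclassify(grid, test):
    """Code cells 1 where test(value) is true, 0 elsewhere."""
    return [[1 if test(v) else 0 for v in row] for row in grid]

def overlay_sum(*grids):
    """Sum co-registered grids cell by cell."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*grids)]

def recode(grid, target, new_value):
    """Assign new_value where the cell equals target, 0 elsewhere."""
    return [[new_value if v == target else 0 for v in row] for row in grid]

# Invented 2 x 3 data planes.
slope_deg = [[2, 4, 8], [12, 3, 9]]
elev_ft = [[7000, 7800, 6500], [7200, 7900, 5900]]
near_perm_water = [[1, 1, 1], [0, 1, 1]]   # within 0.25 km of permanent water
near_pj = [[1, 0, 1], [1, 1, 1]]           # within 0.25 km of pinon-juniper
near_any_water = [[1, 1, 1], [1, 1, 1]]    # within 0.25 km of any water
near_arable = [[1, 1, 1], [1, 0, 1]]       # within 0.1 km of arable soils
base_of_slope = [[1, 1, 1], [1, 1, 0]]

# Site type 1: four planes must all be favorable (sum of 4), recoded to 1.
type1 = recode(overlay_sum(
    reclassify(slope_deg, lambda s: s < 5),
    reclassify(elev_ft, lambda e: e <= 8000),
    near_perm_water, near_pj), 4, 1)

# Site type 2: five planes must all be favorable (sum of 5), recoded to 2.
type2 = recode(overlay_sum(
    reclassify(slope_deg, lambda s: s <= 10),
    reclassify(elev_ft, lambda e: 6000 <= e <= 7500),
    near_any_water, near_arable, base_of_slope), 5, 2)

# Final map: 0 = neither, 1 = type 1 only, 2 = type 2 only, 3 = both.
final = overlay_sum(type1, type2)
for row in final:
    print(row)
```

Because type 1 is coded 1 and type 2 is coded 2, the summed map distinguishes all four outcomes 
without any further bookkeeping. 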


With one exception, the processing to be done within any data plane prior to 
overlaying the separate data planes is simple and straightforward. For example, the 
process of selection according to a range of elevation and slope criteria relies on a 
very basic ability of geographic information systems to reclassify or renumber data 
planes. In the analysis for site type 2, for example, a new copy of the master slope 
map would be made in which all locations with a slope of 10° or less would be coded 
1, while other locations would take on a value of 0. 


Other basic GIS capabilities are illustrated by the operation of finding locations 
within a certain radius of some environmental feature or attribute (such as within 
0.25 km of seasonal or permanent water). One way to do this is to pass a “moving 
window” with a radius equal to the maximum distance allowable from the feature 
across the pixels that constitute the “electronic landscape.” Any point within the 
allowable distance could be flagged on a new map with a certain value, perhaps a 1, 
while other locations would take on a value of 0. Another method, which is usually 
more efficient, employs a function that expands the perimeter of the feature of 
interest by the proper distance. These functions create a concentric zone of 
specifiable width around a point, line, or polygon, an operation that is frequently 
useful in archaeological spatial analysis. One can, for example, specify a vegetation 
zone (pinon-juniper) to be used as a target; the width of the concentric zone to be 
created around any occurrence of this vegetation type; and the numbers to be 
assigned to locations within this expanded zone. In the example discussed above, 
this expansion function would be employed twice during the mapping of possible 
site type 1 locations: once on the roads/streams data plane, using permanent 
streams as a target, and once on the vegetation data plane, using pinon-juniper as a 
target. 
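

A simple version of the expansion function can be sketched as follows. This is a brute-force 
illustration on a raster, assuming 50 m cells and straight-line distances between cell centers; 
the function name and the stream code 9 are hypothetical, and an actual GIS would use its own, 
more efficient routine. 

```python
import math

def buffer_zone(grid, target, radius_m, cell_size_m=50.0):
    """Flag (1) every cell whose center lies within radius_m of a target cell."""
    rows, cols = len(grid), len(grid[0])
    reach = int(radius_m // cell_size_m)   # search distance in whole cells
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target:
                continue
            # Expand outward from this target cell, clipped at the map edge.
            for rr in range(max(0, r - reach), min(rows, r + reach + 1)):
                for cc in range(max(0, c - reach), min(cols, c + reach + 1)):
                    if math.hypot(rr - r, cc - c) * cell_size_m <= radius_m:
                        out[rr][cc] = 1
    return out

# A 3 x 7 plane with one permanent-stream cell (hypothetical code 9).
streams = [[0] * 7 for _ in range(3)]
streams[1][0] = 9

near_water = buffer_zone(streams, target=9, radius_m=100)
for row in near_water:
    print(row)
```

The resulting 0/1 plane is already in the form needed for the overlay step, so the same 
function could be run once with permanent streams and once with a vegetation class as the 
target. 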


The one exception mentioned above to the rule of relatively simple informa- 
tion processing within each data plane involves the problem of finding locations 
near the base of a slope. In most geographic information systems this would require 
a several-stage process (more complex than we need to describe here) that might, 
when completed, give less than perfect results. This example is included to 
demonstrate that not all results that are easy for a human to achieve (as locating 
areas at the base of a slope might be) are necessarily easy to achieve with a computer, 
given current technology. 


The entire analysis just described might take a couple of hours with a large 
computer or a couple of days with a smaller one. In either case, the great time and 
expense incurred in collecting and digitizing the data, once completed, need not be 
repeated, and users with different goals can profit from the accumulated, organized, 
and highly accessible information in the GIS. Even with a smaller, slower computer, 
results are achieved much more rapidly and accurately than if the work was done by 
hand, assuming that the data base is in place. 





Ken Kvamme reiterates his acknowledgment of those persons and institutions mentioned at the 
end of Chapter 8. Tim Kohler would like to thank Judy Hart, David Wherry, and Robert Wright for 
comments on an earlier version of his portion of this chapter. 
Chapter 11 


PREDICTIVE MODELING AND ITS RELATIONSHIP 
TO CULTURAL RESOURCE MANAGEMENT APPLICATIONS 


One goal of this volume, stated in Chapter 1, is to “explore the feasibility and 
practicality of predictive modeling for meeting management objectives.” We will 
address this goal in the following pages. First, however, we need to consider just 
what our management objectives are, and how they relate to what might be called 
“research objectives.” 


Research objectives of modeling tend to fall under the general heading of “an 
improved understanding of the archaeological record.” Models can improve our 
definition and recognition of important types of sites and our understanding of their 
distribution across the landscape. Models can clarify processes of culture change 
and interaction and provide a regional framework for understanding the develop- 
ment and evolution of human systems. They can permit us to understand cultural 
adaptation to differing environments and provide insight into the nature and origin 
of social, political, and economic processes. 


While initially such information might seem abstract and removed from the 
practical requirements of cultural resource management, in reality it meets several 
critical management objectives. Management objectives are sometimes thought to 
be limited to a narrow concern over “how many sites are where,” and indeed, 
models can suggest what types of sites are in a specific area and where in that area 
they might occur. Some models can also be used to generate population estimates 
and statements concerning the probability of site occurrence in a particular loca- 
tion. These classes of information are important in management decisions about 
possible surface-disturbing actions. But the more research-oriented objectives of 
modeling are also important because such models can help to indicate data gaps and 
highlight research issues needing additional work. In this way the use of models can 
help us to focus scarce agency dollars on the collection of the most necessary and 
important data and reduce waste caused by repetition. Such models can help us to 
learn more from existing data and, in some cases, can expedite and streamline 
inventory programs. While some products or applications of models are more 
important in either a research or a management context, in a broader sense research 
and management objectives overlap a great deal, and both stand to gain from a 
model that is reliable and adequately explains as well as predicts site occurrences. 


Chris Kincaid 
In the following pages we will explore the application of specific theoretical and 
methodological approaches described in preceding chapters to modeling in a 
management setting. The discussion is organized around the topics of preparing for, 
implementing, and evaluating a modeling project. Practical considerations are 
foremost. The goal here is to highlight the benefits of using modeling in cultural 
resource management, while at the same time indicating some of the potential 
limitations of its use. The desired end result is a balanced and responsible 
application of modeling concepts to management situations. Although discussions are 
keyed to land management issues, we have tried not to limit them to a 
single-agency perspective. 

Questions have been raised as to whether inventory and evaluation strategies 
employing modeling techniques meet the intent of the National Historic 
Preservation Act, Section 106. This legislation requires a determination of the effect of 
federal agency actions, or federally permitted or licensed actions, on all properties 
listed in or eligible for listing in the National Register of Historic Places. 


Under the provisions of this legislation, decisions about appropriate inventory
and evaluation strategies are made through consultation between the federal
agency and the State Historic Preservation Officer (sometimes also including the
Advisory Council on Historic Preservation). There are no set criteria for deciding
what is appropriate; rather, propriety is defined on a case-by-case basis through the
consultation process, within the broad structure of the regulations. The decision as
to whether or not modeling should be part of an inventory and evaluation approach
depends on individual circumstances. A decision to use modeling complies with the
regulations if it was reached in accordance with the consultation procedures. For
this reason, compliance questions are not addressed further in this chapter.


WHAT ARE MODELS ABOUT? 


As analytical tools, archaeological resource models are especially well-suited to
applications in land management. Among other things, they identify patterns in
spatial relationships between sites and their physical locations and thus indicate
potential relationships between the natural or social environment and the locations
of past human activities. A causal relationship is envisioned: environmental factors
influence where human activities occur. Measurements that define or describe
controlling aspects of the natural or social environment are called independent
variables, while measurements of affected human activities, observable in the
archaeological record, are called dependent variables.


The development of models centers around three main tasks: classification of
independent variables, classification of dependent variables, and expression of the
relationship between them. Since different cultural groups interact with each other
and their environment in different ways, the critical independent and dependent
variables and their relationship can vary widely from cultural system to cultural
system. The goal of this kind of modeling is to produce reasonably accurate
representations of selected interrelationships for particular cultural systems. A
successful model or series of models allows us to organize what we know about
sites—their function, location, and cultural affiliation—into a series of affirmative
statements about human behavior. Under controlled conditions, these statements
can be applied to unknown areas to provide predictions about resources located in
these areas.
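

The three tasks described above can be illustrated with a very small sketch. It assumes a hypothetical set of surveyed units, each classified by one independent variable (a landform class) and one dependent variable (whether a site was recorded). The class names and counts are invented for illustration, and a simple frequency table stands in for the much richer statistical relationships discussed elsewhere in this volume.

```python
from collections import Counter

# Hypothetical survey records: (landform class, site present?).
# Landform class is the independent variable; site presence is the
# dependent variable. All values are invented for illustration only.
units = [
    ("bench", True), ("bench", True), ("bench", False),
    ("wash", True), ("wash", False), ("wash", False),
    ("flat", False), ("flat", False), ("flat", False), ("flat", True),
]

def site_frequency_by_class(records):
    """Return {class: fraction of surveyed units containing a site}."""
    surveyed = Counter(cls for cls, _ in records)
    with_site = Counter(cls for cls, present in records if present)
    return {cls: with_site[cls] / n for cls, n in surveyed.items()}

frequencies = site_frequency_by_class(units)
for cls, freq in sorted(frequencies.items()):
    print(f"{cls}: {freq:.2f}")
```

A table of this kind expresses only a correlation. It says nothing about why sites occur more often in one class than another, which is the explanatory question taken up later in the chapter.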


Our goal is to correctly identify important aspects of the natural or social
environment that influenced the location of human activities, and to interpret the
archaeological record as the result of a set of functional, temporal, spatial, and
behavioral responses to a varied environment. We may, in effect, try to reconstruct
the “rules” of interaction between these two components. The relationship
between sites and their natural environments is not as easily discoverable or as
direct as the relationships among natural phenomena. Although governed to an
extent by the demonstrably regular and consistent rules that apply to all living
systems, human behavior is organized into cultural systems, which exert additional
influences on that behavior beyond those of natural forces. There is good reason to
believe that site locations cannot, in general, be fully predicted from environmental
variables alone.


Because of the influence of cultural variables on human behavior, models of
cultural systems are subject to many more sources of error than those for natural
systems. The cultural rules that govern how human groups interact with their
social and natural environments are not easy to identify, even for modern cultures.
Studying and identifying such relationships for cultures that have been extinct for
thousands of years is an even more difficult task. In land management applications,
therefore, models of natural phenomena and models of cultural phenomena should
not be considered equivalent. Managers need to have a realistic understanding of
what models can and cannot do in order to use them effectively.


WHAT CONDITIONS ARE FAVORABLE FOR 
MODELING PROJECTS? 


Conditions 


Before a decision is made to embark on a modeling project to satisfy either
research or management objectives, several conditions must be met. Frequently,
these conditions relate to circumstances, such as the boundaries of the study area,
time, financial constraints, etc., that are beyond the control of the project manager.
For example, the size of the potential modeling project area is important. As a
general rule, modeling is not feasible for small projects covering less than 5000 to
$0,000 acres. Models are most easily interpreted and understood if they relate in a
defined way to cultural boundaries or to major environmental zones. When only a
small portion of a culture area or environmental zone can be analyzed, only a portion
of a cultural system might be examined. Observed site patterning in the study area
may be responding to factors that are “uncontrolled” in the terms of the model
because they are a response to forces or events located outside the study area. The
chances of developing an accurate and interpretable model are greatly reduced by
this circumstance.


In designing small modeling projects, difficulties often occur in meeting
minimum sample-size requirements for statistical analyses. Altschul and Nagle
address this problem in some detail in Chapter 6. In general, they advise that for
cluster analysis of site types, a minimum of 30 sample units (not including empty
units) is necessary, a condition frequently not met in sample inventories. Unfortu-
nately, the number of sample units necessary for a valid analysis cannot be antici-
pated prior to fieldwork. Additional inventory may be required to reduce sample
variance if a majority of sample units do not contain sites while the remaining units
contain many sites.
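

As a rough illustration of the sample-size concern, the following sketch summarizes a hypothetical list of per-unit site counts. It checks the 30-nonempty-unit threshold mentioned above and reports the sample variance, which is large when most units are empty and a few contain many sites. The function name and the counts are assumptions made for this example, not part of any procedure described in Chapter 6.

```python
from statistics import mean, variance

MIN_NONEMPTY_UNITS = 30  # threshold cited in the text for cluster analysis

def summarize_sample(site_counts):
    """Summarize per-unit site counts from a sample inventory."""
    nonempty = [c for c in site_counts if c > 0]
    return {
        "units": len(site_counts),
        "nonempty_units": len(nonempty),
        "enough_for_clustering": len(nonempty) >= MIN_NONEMPTY_UNITS,
        "mean_sites_per_unit": mean(site_counts),
        "variance": variance(site_counts),  # sample variance (n - 1)
    }

# Hypothetical inventory: 40 units, most empty, a few with many sites.
counts = [0] * 28 + [1] * 6 + [8, 9, 10, 12, 7, 11]
summary = summarize_sample(counts)
print(summary)
```

In this invented case only 12 of 40 units contain sites, so the threshold is not met, and the variance is several times the mean, the situation in which additional inventory may be needed.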


The configuration of the modeling project area is also important. Linear as
compared to areal projects are generally more difficult to model because linear
projects tend to cross-cut several environmental and cultural zones, each of which
may be poorly represented as regards total acreage. More complex models, or
additional models, may be needed in these cases.


Another important factor is the amount of time allotted for the modeling
project. Modeling is useful as a long-term technique for organizing and structuring
data and data collection priorities. It is less useful under a short time frame that does
not allow for testing and refinement phases.


Often the nature of the archaeological record itself can indicate that special
strategies will be needed for modeling efforts. For example, if 75 percent of the
known sites in an area are classified as undiagnostic lithic scatters, neither chrono-
logically nor functionally specific models can be developed. Under these circum-
stances, care should be taken in designing any new sample inventories in the area to
assure that detailed information pertaining to attributes of artifacts is collected.
These data could be crucial in the definition of site classes during postinventory
modeling efforts.


Sometimes the environment determines whether modeling will be easy or
difficult. In Chapter 4, Ebert and Kohler distinguish between environmental
variables (which measure a single aspect of the environment, e.g., slope) and
ecosystem variables (which measure systemic attributes reflective of interaction
among environmental variables, e.g., effective temperature, spatial periodicity, and
environmental diversity). The most usable ecosystem variables for predicting site
locations are those that monitor spatial availability of resources (e.g., degree of
patchiness) and temporal availability of resources (e.g., degree of constancy, con-
tingency, and predictability). Ebert and Kohler conclude that, in general, hetero-
geneous environments in which critical resources are temporally predictable and
occur in highly concentrated and overlapping patches are apt to be best for
locational modeling and prediction. Conversely, a basically homogeneous environ-
ment, in which critical resources are dispersed and only sporadically available, will
be more difficult to model. For example, site locations may be more difficult to
model on a desert creosote flat without major drainages or contrasting landforms
than they would be on a flat broken by large dry washes or benchlands, where
critical vegetative resources are apt to occur.
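

One way to put a number on the contrast between a homogeneous and a heterogeneous study area is a diversity index computed over mapped environmental classes. The sketch below uses the Shannon index as a stand-in; this choice is an assumption made here for illustration and is not necessarily the measure of environmental diversity that Ebert and Kohler have in mind. The class names and acreages are hypothetical.

```python
import math

def shannon_diversity(area_by_class):
    """Shannon index H = -sum(p * ln p) over classes with nonzero area."""
    total = sum(area_by_class.values())
    proportions = [a / total for a in area_by_class.values() if a > 0]
    return -sum(p * math.log(p) for p in proportions)

# Hypothetical acreages by vegetation or landform class.
creosote_flat = {"creosote flat": 980, "wash": 20}
dissected_flat = {"creosote flat": 500, "wash": 250, "benchland": 250}

h_flat = shannon_diversity(creosote_flat)
h_dissected = shannon_diversity(dissected_flat)
print(f"homogeneous flat: H = {h_flat:.3f}")
print(f"dissected flat:   H = {h_dissected:.3f}")
```

A higher value indicates that the area is divided more evenly among more classes. The index says nothing about the temporal predictability of resources, which would need to be assessed separately.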


Also, changes in the earth's surface may have taken place after the deposition
of archaeological materials (see discussion of postdepositional processes in Chapters
4, 6, and 9). Postdepositional processes might include movement of sand dunes,
deposition of alluvium, erosion by wind or water, etc. Large portions of a project
area may have been covered over or scoured away as a result of these processes. In
areas undergoing active alluviation, for example, exposed surfaces may be no older
than 200-300 years. The possibility of finding prehistoric sites under such conditions
is greatly reduced, and modeling efforts directed to prehistoric site locations would
be unproductive. Under these circumstances specialized strategies (such as inven-
tories focused on road cuts, arroyos, etc.) may be appropriate.


Administrative Concerns


The risks of embarking on a modeling project should be evaluated realistically
at the onset of a project and weighed against such administrative constraints as
project schedules and costs. To develop a model that meets a specified level of
precision, additional testing and analyses may be needed, sometimes causing delays
and increased costs. Clearly, the importance of these concerns will depend on the
type of modeling project envisioned and its use.


Time should be allowed for model testing and revision during any project. In
Chapter 6, Altschul describes a multistage survey design, a useful means for staging
sample-based fieldwork so that the maximum benefit is derived from each succes-
sive stage. While a multistage approach may seem more time-consuming, expe-
rience has shown that a single data-collection phase is seldom adequate for model
development (depending, of course, on the size of the initial sample and the
availability of relevant historical, ethnographic, and other data) and may be less
efficient over the long term.


If a model is being developed to reduce the cost of field inventory, various
hidden costs should be taken into account. Short-term field inventory costs are
almost always less for partial coverage than for full coverage, even allowing for the
substantial field time needed to locate dispersed sample units, but the cost of
developing a predictive model is not limited to the costs of the sample inventory. In
any given modeling project, time and funds also should be allocated for such tasks as


1. the detailed analysis of existing information,
2. preparation of environmental data,
3. development and execution of successive phases of model testing (using
independent data), and
4. collection and processing of supplemental information about site variabil-
ity (through various combinations of detailed recording, surface collection,
testing, excavation, and laboratory analyses).


Planning for these additional costs is not easy. Exact estimates as to the amount of 
work that will be needed to develop a model of the required precision cannot be 
made before a project is even begun.


Perhaps the most cost-effective context for model development is within the 
framework of general planning by a land-managing agency or a local government. 
These programs can develop and sustain long-term approaches that are funded 
incrementally and result in cumulative and refined data bases. Such data bases, and 
the models based on them, may take years to develop and test. The end result, 
however, is a powerful and effective management tool.


WHAT KINDS OF MODELS ARE THERE? 
WHEN DO WE USE WHICH TYPE? 


Models are classified in many different ways. In Chapter 2, for example, models
are compared with respect to their focus (systemic, representing a cultural system; 
analytic, reflecting the analysis of archaeological data), their logical origin (induc- 
tively or deductively derived), and the level of measurement (nominal, ordinal, 
interval, and ratio scales). Figure 2.1 presents the structure of this discussion. 


In Chapter 3, intuitive models are distinguished from objective models on the 
basis of whether or not components can be operationalized or measured. Objective 
models are then broken down further on the basis of geographic precision (are 
predictions specific to points or to areas?), on procedural logic (inductive versus 
deductive reasoning), and on the relative emphasis given different variables. A 
summary of this approach is presented in Table 3.1. 


Chapter 5 contains a discussion of various statistical techniques that have been 
used to classify models (e.g., linear regression, logistic regression, and discrimi-
nant function analysis). Kvamme, in Chapter 8, distinguishes between models
based on trends in “location only” (defined solely in terms of locational coordinates
x and y) and models based on trends in “locational characteristics,” or a wide range 
of environmental attributes of these locations. He further divides models pertaining 
to the characteristics of locations according to whether they are based on parametric 
or nonparametric statistics. 


How does the cultural resource manager know which type of model is best? Is 
it not possible to define one type of model that is best for cultural resource 
management purposes and apply this type to all situations? 


To understand the significance of the modeling terms used in various portions
of this volume, we should view them not as designations of types of models but as
descriptive labels for various traits or attributes of models. A cultural resource
manager seeking the best model for his or her purposes must ask first, what is the
overall objective of developing a model? If there is a need that only one type of model will
fill, then clearly this type of model must be sought. More often, a manager simply
seeks the most precise, detailed, extensible, and accurate model affordable. Such
considerations as the nature of the existing data base, environmental complexity,
etc. (see discussion in the previous section), ultimately intervene to limit the
quality of the model obtainable with given data.


One of the broadest and most useful classifications is the one based on 
procedural logic, distinguishing between inductive and deductive approaches. The 
relative merits of these strategies have been debated throughout this volume. 
Briefly, a deductive approach, i.e., one proceeding from theory to data, often
explains why a model works. This is necessary, especially if the model is to be
successfully applied to other settings. The major drawback of deductive models is
the difficulty in making them operational. For example, deductive models often
contain general propositions, such as “population growth leads to more intensive
resource utilization.” The archaeologist must determine how “population” and
“resource utilization” will be measured to show growth and increased intensity.
Abstract concepts such as these may be difficult to measure in tangible terms from
archaeological data, especially if these data are sparse, as is often the case.


In contrast, inductive models proceed from data to theory; observed correla-
tions in the data are used to formulate general hypotheses. If, for example, several
major village sites in a particular area are located near or on one particular soil type,
one might hypothesize that large habitation sites tend to be located close to this
particular soil type. Such conclusions may be readily derived through data analysis,
but models that depend on them are often criticized for not explaining why the
observed correlations occur. Most models developed for cultural resource manage-
ment purposes are inductive.


Clearly, managers should understand why a model works, but in addition they 
need an approach that is operational. Joseph Tainter (personal communication, 
1982), in commenting on one of the initial drafts of this volume, offered the following
observations on this matter: 


The crucial question is not whether a model is derived deductively or inductively, but
whether it focuses on explaining patterns or merely projecting them. Explanations can
precede or follow data collection, but must be developed at some point.


One way of achieving this may be to structure the modeling process to be sure 
that both deductive and inductive phases are included. 


In reality, in the long-term time frame of cultural resource management
programs, the distinction between deductive and inductive approaches becomes
blurred. The model building and refinement process is based on a continuous cycle
of data collection, analysis, and model refinement. The results of one cycle of field
testing and analysis are used to refine the model, which then guides the next phase
of data collection. The eventual merging of deductive and inductive strategies may
be the direction of future modeling approaches in cultural resource management
contexts.


How does a manager know which specific type of model is needed for a
particular application? If a model is needed for a limited, short-term application to a
relatively small project area (for example, in connection with the processing of a 
right-of-way or an energy development application), a relatively limited range of 
models will be appropriate. For more general, long-term applications in the cultural 
resource program, a wider range of models could be applicable and useful. 


Given the complexity of cultural resource management, virtually every type of 
model has some utility. Both inductive and deductive models are appropriate in 
varying degrees, depending on the circumstances. Deductive models have the 
greater utility in developing inventories and in such program activities as site 
interpretation. Inductive, correlative models usually have the statistical precision 
needed to develop quantitative estimates of site populations, densities, and distri- 
butions and are currently the better source of such estimations. Both types of 
models may be needed in a comprehensive cultural resource management program. 


HOW CAN WE PREPARE FOR MODEL DEVELOPMENT? 


Model development is a repetitive process of inventory and analysis that is
most effective as a long-term strategy. In general, the quality of the model depends 
on the quality of the data; better data bases yield more precise and accurate models. 


Even before beginning the modeling process, the cultural resource specialist 
can take many steps that do not require large-scale or expensive sample inventories. 
Since the beginning of cultural resource management programs, managers have 
recognized the need to make full use of existing information. Chapter 7 specifically 
addresses model-building requirements and techniques to develop good data bases. 
As a first step, the cultural resource specialist should accumulate and screen all 
available information on the study area’s history and ethnography, and on previous 
survey work in the area. The quality of data on previously recorded archaeological 
sites and other historical properties should be carefully reviewed for locational 
accuracy and completeness, and sites for which information is poor should be set 
aside for later evaluation. Checking selected sites in the field may be necessary to 
evaluate recording practices and improve information. 


The second step should involve assembling information into a coherent, usable 
format. If the equipment and expertise are available, this might include automating 
the site data base. Several regions and states have systems for managing site data. If 
one of these is not available, a data base can be set up on an office computer. 
Automating the data base allows the specialist to review the data easily and 
informally, and to evaluate them apart from any ongoing modeling project. Ana- 
lyses of existing data for future modeling projects will be much simpler and less 
expensive than current methods. Subfiles can be used to store more detailed site
and artifact data specific to individual sites. These subfiles can be created easily and
accessed as needed during detailed analysis. 

All surveyed areas should be mapped on base maps. The type and complete-
ness of survey coverage must be carefully scrutinized. Using information provided
in project reports, the specialist should separate projects in which coverage appears 
to have been biased, incomplete, or otherwise suspect from those in which survey 
and recording practices conform to acceptable standards. Only survey data that are 
relatively complete can be used confidently in studies of spatial distribution. While 
data recorded during less rigorous, nonstandard surveys may be very useful for 
site-level analyses, methodological biases that distort apparent spatial distributions 
make these data unsuitable for modeling purposes. 


Documents summarizing existing data in an area, such as Class I inventories, 
state plans, and regional research designs, can be especially useful as a preliminary 
source of site distribution information within a study area. Research designs should 
summarize what is known about an area in the form of model-like statements or 
hypotheses, which can then be tested when new data are collected. These studies 
can be completed on a contract basis, generally at relatively low cost. 


The definition of site types reflecting temporal, functional, and cultural 
differences is perhaps one of the most useful tasks that can be performed to prepare 
for model building. (Procedures for this task are discussed in Chapters 5 and 8.) Site 
types or other similar classification schemes are one of the primary components of 
models. 


Environmental data are also needed for model building. To be useful, how- 
ever, they should be of a consistent quality and scale throughout the study area. 
Land-managing agencies typically expend considerable effort in collecting a wide 
range of environmental data for land-use planning and environmental impact 
considerations. This is done through field inventories, analysis of aerial photo- 
graphs and other remote sensing data, GIS development, etc. The manager should 
ensure that such data collection projects take into consideration the unique needs of 
the cultural resource program. These needs (e.g., for data pertaining to the 
paleoenvironment or identifying postdepositional processes) should be anticipated 
by the manager and, where possible, collected as part of other specialized studies. In 
areas of adjacent or mixed jurisdiction, opportunities for interagency development 
of environmental data can be explored to reduce costs. 

Once the requisite data bases have been assembled, screened, and organized, 
several kinds of preliminary analyses can be performed to evaluate and characterize 
the data. This step is actually the beginning of the model-building process, which 
will be discussed further in the following section. These preliminary analyses are apt 
to be biased and inaccurate, however, because the existing data used at this point 
probably do not represent the study area as a whole. 


This does not mean that trial models developed at this stage are unusable, only 
that their use is limited, and that they should be used with caution. Trial models 
provide a check on the adequacy of field recording procedures. Even an initial
modeling exercise may point out the need for additional detail in artifact recording,
for subsurface testing, for an increased level of site examinations, for a shift to
interval or ratio-scaled data, etc.


Appropriate changes made in site recording techniques can greatly increase
the chances of successful modeling efforts in the future. Even deficient models can 
help to identify gaps in inventory coverage and highlight new data needs. Similarly, 
future large-scale modeling efforts may improve substantially if preliminary small- 
scale projects can first be applied to areas for which we have relatively little 
information. 

Statistically representative data are not necessary to develop a model; if new 
data collection is planned for purposes of model development, however, these are 
certainly the most effective data to collect. Model testing, on the other hand, does 
depend upon the availability of unbiased data that are representative of the study 
area, most often data that were collected using some form of random sampling. 
Until a representative sample of data is obtained through a carefully designed 
inventory project, any model developed for the area must remain essentially
untested and should be used accordingly (see the section on model evaluation,
below).


HOW DO WE PLAN A MODEL? 


Should funding become available for a modeling project, several measures can 
be taken to ensure that management needs will be met and that the project will be 
as successful as possible. To begin with, in what might be called a preplanning 
phase, the goal needs to be clearly defined. The purpose of the project should be
carefully considered, recorded, and reviewed by managers and other resource- 
program staff members. Both long-term and short-term goals should be considered, 
including all potential management applications of the product, as well as imme- 
diate uses. The possibility of phasing the project over a period of years should be 
considered, depending on whether one-time or continued funding is anticipated.


Two important decisions to be made are the size of the target study area and
the type and resolution of desired model products. For large study areas, entering 
into joint projects with agencies or others (e.g., Indian tribes or local governments) 
who manage adjacent lands may be advantageous, especially if the combined land 
base more nearly addresses a meaningful cultural or environmental unit. Establish- 
ing two study areas may also be advantageous —a larger one to be used during the 
analysis of existing data and a smaller, more limited one to be used in definition of 
the target population of inventory sample units. 


The full range of available model products and their limitations should be 
weighed to ensure that initial expectations match the results. Possible types of 
products include statements about relative site distributions, population estimates 
(e.g., estimated numbers of sites in unsurveyed sample units, numbers of unsur-
veyed sample units without sites, and total site population), correlations between
certain environmental factors and the locations of certain types of sites, and
probability statements (e.g., the probability of finding a site in any one sample unit
or the probability that the observed result would occur by chance alone). 
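

The simplest of these products can be sketched as follows, under strong assumptions that would need checking in any real project: equal-size sample units drawn by simple random sampling, no stratification, and no finite-population correction. The function and its arguments are hypothetical names chosen for this example, and the figures are invented.

```python
import math

def estimate_from_sample(units_surveyed, units_with_sites, sites_found,
                         units_in_population, z=1.96):
    """Simple estimates from a simple random sample of equal-size units."""
    p = units_with_sites / units_surveyed          # P(site in a unit)
    se = math.sqrt(p * (1 - p) / units_surveyed)   # normal-approximation SE
    density = sites_found / units_surveyed         # mean sites per unit
    return {
        "p_site": p,
        "p_interval": (max(0.0, p - z * se), min(1.0, p + z * se)),
        "estimated_total_sites": density * units_in_population,
        "estimated_empty_units": (1 - p) * units_in_population,
    }

# Hypothetical figures: 50 of 500 units surveyed, 15 contained sites,
# 40 sites recorded in all.
result = estimate_from_sample(50, 15, 40, 500)
for name, value in result.items():
    print(name, value)
```

The interval around the probability shows how wide the uncertainty remains with a sample of this size, which is one reason initial expectations should be weighed against what the data can support.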


In many cases, the value of the results depends on the detail of the environ- 
mental data recorded for each sample unit and the levels and types of measurement 
used in recording the data. These factors should be addressed during the planning 
phase of the modeling project and should reflect the management constraints, 
goals, and limitations identified during the preplanning phase. 


After the management needs have been clearly defined, a modeling project 
plan should be developed. The purpose of a project plan is to break down the 
modeling process into a series of review and decision points, whereby the manager 
or specialist and the individual implementing the project (in most cases a contrac-
tor) can review progress and jointly participate in key decisions.


The first step in the development of the plan should be a review of existing 
data and formulation of a trial model for the study area. Plans to test this trial model
should be described in a research design that clearly spells out research issues, data
gaps, and priorities for collection of new data during sample inventory work.
Statements should address data-collection activities: selection of inventory areas;
data recording for micro- and macro-environmental data; and recording of site,
feature, and artifact attributes. Each recording activity should be carefully 
reviewed to ensure that the most powerful measurement system will be used, a 
critical factor if inventory results are to have the maximum applicability to model- 
ing efforts. 





Detail concerning the rationale for selection of sample inventory units should 
be provided in the sample design. The sample design should clearly describe any 
proposed stratification schemes and their goals, plus the configuration of the sample 
units and the method of their selection. It should also evaluate the need for multiple 
survey strategies (e.g., a mix of random and judgmental samples). 
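

As one illustration of what a sample design might specify, the sketch below draws a stratified random sample: units are grouped into strata, and the same fraction is selected at random within each. The strata, unit numbering, sampling fraction, and function name are all assumptions made for this example; an actual design would set them according to the considerations discussed in Chapter 6.

```python
import random

def stratified_sample(units_by_stratum, fraction, seed=0):
    """Draw a simple random sample of unit IDs within each stratum."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    selection = {}
    for stratum, unit_ids in units_by_stratum.items():
        n = max(1, round(len(unit_ids) * fraction))  # at least one unit
        selection[stratum] = sorted(rng.sample(unit_ids, n))
    return selection

# Hypothetical strata and unit IDs.
frame = {
    "benchland": list(range(1, 41)),
    "wash": list(range(41, 61)),
    "flat": list(range(61, 201)),
}
selected = stratified_sample(frame, fraction=0.10)
for stratum, ids in selected.items():
    print(stratum, ids)
```

Recording the seed alongside the design lets reviewers reproduce the selection, and judgmentally chosen units can be listed separately so that they are not mixed with the random sample during model testing.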


At a minimum, the results of the initial data review, the trial model, and the
data-collection proposal should be included in a preliminary report prepared prior 
to initiation of fieldwork. The manager and specialist can thus determine at this 
preliminary stage whether maximum use has been made of existing data and can 
ensure that the first stage of field inventory is directed toward a model testing 
effort. Peer review of this report may be desirable. 


The plan should next address the second step of the project—the fieldwork
phase. Detailed information about the proposed field methods, including rates of
inventory, recording standards, collection strategies, and schedules, should be
provided.


The third step of the project plan should address analysis and preparation of 
the final report. In this part of the plan, proposed approaches to data preparation 
and analysis should be described. The relationship between proposed products of
analysis and the original management goals should be discussed, even though at this
stage the former might be tentative. Ultimately, the results of fieldwork—
including the number, variability, and distribution of sites—will have a principal
role in determining the level of analysis possible. The anticipated artifact studies or
other laboratory analyses designed to distinguish site types and functions should be 
described as well. 


The final report should be structured to present the model, describe the uses 
of the data, explain differences between the initial trial model and the final model, 
and describe its limitations and appropriate applications. An explicit statement 
should also be included to detail how the model could or should be tested, 
regardless of whether additional inventory is envisioned in the near future.


Specific technical information on the most effective ways of performing model- 
ing tasks has been presented throughout this volume; this information should be 
read carefully by any cultural resource specialist responsible for overseeing or 
monitoring a modeling project. A description of the overall modeling process from 
the perspective of the land managing agency is provided by Altschul in Chapter 3. A
critical discussion explaining types of measurements and their importance to 
modeling is presented in Chapter 5, followed by an extensive treatment of the 
mechanics of the model-building process, including development of site
classifications. Specific topics such as sampling strategies, parameter estimation,
the empty unit problem, phased sampling and survey, and data-recording strategies are
covered in Chapter 6 by Altschul and Nagle. Kvamme, in Chapter 7, analyzes the 
use of existing data in trial model formulation. In Chapter 8 he looks more closely at 
different types of models and compares their output and applicability to manage- 
ment situations. Chapter 8 also contains a review of techniques for model testing 
and refinement, addressing such topics as parametric and nonparametric statistical 
analyses and assumptions about the data, testing, and confidence intervals. 


HOW DO WE APPLY MODELING IN CULTURAL 
RESOURCE MANAGEMENT? 


Many aspects of cultural resource management can benefit directly or
indirectly from the use of modeling techniques. Even if a formalized model is not
developed, the techniques used to prepare cultural resource data for a modeling 
exercise (see the section on preparation for model development, above) can have 
useful side benefits. Some of these are discussed below. 


Inventory 


Models can be used in the design of comprehensive inventories specific to a
defined land base or land-use area. Within this area, information concerning site
types, site locations, and environmental characteristics can be ordered as either
dependent or independent variables and (under an inductive approach) analyzed to
detect patterning in the data. Through this process, additional data needs (e.g., for
more detailed site data, for inventory coverage in specific places) become evident
and can be prioritized. Even in preliminary modeling efforts (deductive or
inductive), new types of information, such as siteless locations, paleoenvironmental
data, and information on postdepositional processes, are often needed.


Models can also drive inventory efforts when information is needed to test
(rather than develop) the model. While models can be created from a diverse and
not rigorously representative data base, they can only be tested properly using an
unbiased data base, i.e., one that represents the study area as a whole. Because of
the overall utility of models, the collection of sample inventory data for model
testing and refinement should be a high inventory priority, regardless of whether
these data are selected within an administrative land base (e.g., a resource area or
forest) or a limited study area within that administrative land base.


Because of limited funding, some managers may consider turning to modeling
as a substitute for, or as a way of limiting, new inventory data collection. This is not
a cure-all approach, however, because the results of a model are only as good as the
data on which the model is based. For this reason, models sometimes do a poor job of
predicting variability and may not be reliable or precise. While each case must be
evaluated on its own merits, there are several criteria that cultural resource
managers should consider when deciding how to use modeling in field inventory
efforts. One important consideration is the possible repercussions if scarce
inventory dollars are spent to develop a model that cannot perform to the desired
level of accuracy. The purpose for which an inventory is conducted will determine how
serious this problem will be. The type of model being used must be evaluated with
respect to the application under consideration. The analytical origin of the model is
important, as is the question of whether it has been tested. (More specific criteria
for model evaluation are included in the next section.)


The types of sites in the study area are also an important factor. It is one thing
to limit inventory in an area thought to contain homogeneous archaeological
remains, such as small sites with limited variability, no depth, and shared attributes. 
It is quite different to limit inventories in an area known to contain complex, large, 
or stratified sites; a heterogeneous site population; or what Altschul and Nagle 
(Chapter 6) refer to as magnet sites (sites thought to influence the location of other 
sites). 

The scale of resolution of the model is important. Zonal models perform 
differently from point models and generally cannot provide specific site-likelihood 
indications for designated locations. 

Land managers should guard against the improper use of intuitively derived
models in influencing inventory efforts. Archaeologists who work frequently in an
area often develop a “feel” for where sites should be found. Occasionally, these
intuitions have been used as a basis for limiting inventory to certain areas without
testing others. A danger in this approach is that if sites are sought only where they
are thought to exist, the prediction may become a self-fulfilling prophecy. Potential
results can include destruction of significant resources or introduction of a strong
bias into the data base.


Intuitions should not be dismissed, but neither should they be equated with
scientifically verified information. They should be formalized, expressed in terms
that can be measured and applied in the inventory process, and subjected to a
rigorous testing program. In this way they can be of vital importance in effective
model development.


Evaluation 


In evaluating an archaeological site for management purposes, two major kinds
of questions are asked. One has to do with a site's significance (generally expressed 
in terms of eligibility for the National Register of Historic Places). The other has to 
do with determination of its most appropriate use(s), toward which further man- 
agement actions should be directed. Modeling can contribute to both kinds of 
evaluation. 

The significance of a site can be measured by its potential to contribute to our 
understanding of the historical and prehistoric past. On a broad scale, models can 
help to clarify these research issues, thus providing a more consistent regional 
context for site evaluation. 


By focusing research on the location of sites, as well as on the types of sites
expected to occur in specific locations, the modeling process can help to increase the 
accuracy and precision of functional, temporal, and spatial qualifiers. Modeling 
helps to define major similarities and differences among sites and reflects the 
information potential for both identified and projected sites within an area. In 
evaluating whether a particular site is potentially significant, the specialist often 
relies on previous experience with other sites of the same type.


The importance of a site cannot be equated solely with its membership in a
particular site type or class, however; clearly, the rare or unique site, which fails to
appear as a separate type during statistical analyses, may be the most significant site
in an area. These sites are often not amenable to identification through sample
inventory, but they can be successfully integrated into predictive models if
sufficient information is known about them. This issue is discussed further in the next
section.


It is important to consider the physical characteristics of a site as well as its class
membership. For example, a broad class labeled “habitation sites” might include
sites with or without structures. Some large sites might be in poor condition, with
virtually no remaining information potential, while some small sites might contain
substantial undisturbed deposits. Clearly, significance assessment must address 
individual site characteristics, as well as class membership. 


Detailed information about the relative scarcity, relative research importance,
and locations of archaeological sites, whether they have all been discovered and
recorded or not, can help in the determination of their appropriate uses. Examples
of possible uses include ongoing or potential scientific study, maintenance of a social
and/or cultural group's traditional lifeways, public education and interpretation,
and experimental management studies. By refining ideas about all of the
archaeological remains in a target area, models can be extremely valuable for focusing
and organizing use and allocation decisions.





Protection 


Another important program component where modeling can be applied is in
the area of protection. Protection refers to measures taken to reduce natural or
human-caused impacts to the significant qualities of cultural properties or to the
attainment of their appropriate uses. Measures may include information signs,
physical barriers, patrol and surveillance, monitoring, detailed recording,
excavation, stabilization, and administrative measures, such as access restrictions,
withdrawals from other land-use activities, avoidance during construction, etc.


The principal way in which modeling can contribute to protection is by
helping to establish priorities among sites for specialized treatment. In general,
land-managing agencies are interested in protecting and preserving a “representative
array” of sites and site data. It is useful to visualize this array in terms of types of
sites; all sites of a like type constitute a finite site pool. Theoretically, protection and
preservation efforts should be directed toward maintaining a representative site
pool of each site type for future needs. Modeling provides a basis for determining
the array of site types in a particular area, as we have seen, and in some cases can be
used to generate population estimates for various site pools.


Models can also help to define research issues. This information can serve to
guide data collection priorities for data recovery efforts and can help to establish
which sites should be selected for these efforts. Models can be used to identify
project areas likely to contain the types of sites most attractive to vandals, thus
indicating priority areas for patrol and surveillance.


Planning 


Perhaps one of the most valuable applications of modeling is in the area of
planning. Planning for the management of cultural resources can take place during 
the development of land-use plans, environmental assessments, statewide or area- 
wide program plans, or site-specific plans. Models are especially suited to planning 
applications, because they focus on broad-scale, generalized trends, actions, or 
information. The main weakness of models, the inability to consistently produce
detailed site-level specific statements, is usually not critical in a planning situation. 

















Modeling can help to plan how to reduce or anticipate adverse effects on
cultural resources. For example, a model may predict the locations of sites that,
because of their complexity or their cultural or religious value to a Native American
group, may not be suited to data recovery. On the other hand, a model may predict
the locations of sites that are suitable for data recovery of various kinds; estimates of
potential costs and time needed for data recovery can then be derived by projecting
site distributions in the planning area. Potential long-term and cumulative impacts
to site pools by a proposed action can also be estimated, based on modeled
populations of various site types. Modeling projections of so-called sensitivity areas
have been widely used for planning purposes.


HOW CAN MODELS BE EVALUATED? 


There are realistic limitations on the level of accuracy we can hope to achieve
in locational modeling, owing primarily to the complex nature of the behavioral
factors influencing site location. Models are simplified constructs of a complex
universe that are seldom clearly right or wrong; rather, they are best viewed as
being more or less useful. Often a model will excel in one application but fail in
others. It is important for a manager to know what criteria of success are most
important to the proposed application of the model before embarking on a modeling
project. While it is unrealistic to expect models to work with “perfect”
predictive accuracy, it is not unrealistic to expect to know how well a particular
model works and why. Indeed, this information is critical in deciding how the model
should be used.


Several authors have discussed various criteria for model evaluation. In
Chapter 2, Kohler presents an extremely useful discussion of inductive and deductive
models, addressing their application, complexity, internal consistency, and
precision. The appendix by Thoms carries this approach further, extensively comparing
22 models.

Undoubtedly the most important criterion to consider in evaluating a model is
whether or not it has been tested. As noted earlier, untested models can be
developed and formalized by using existing data, much of which contains biases.
Simply because a model is formally stated, however, one should not assume that it
has been tested or that its performance has been evaluated. Without testing or
evaluation, a model is little more than a guess.


One of the main reasons for testing a model is to control for spurious or false
correlations between site locations and the environment in a particular sample.
Such correlations can be minimized by reducing chances for bias in the units
selected for model testing, i.e., by avoiding artificial constraints and by selecting
sample units randomly. Consider, for example, a sample inventory in which only
sample units falling within 2 mi of a modern road have a chance of being selected.
Analysis of site locations might reveal a marked correlation with geographical
variables that actually have more to do with road engineering requirements than
with past human settlement preferences.


Establishing a control group to be used in model testing has been discussed in
Chapter 8. This procedure is useful for model testing because it provides a
background or baseline picture of the study area as a whole, against which
model-generated statements can be evaluated. If, for example, it is noted that 90 percent of
recorded sites are located within 5 mi of water, this observation could be very
significant. If 90 percent of a control group of siteless locations in the project area
were also found to be within 5 mi of water, however, the model would have told us
nothing of significance about site locations.
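

The baseline comparison just described can be sketched in a few lines of Python.
This is an illustration only: the counts are hypothetical, the function name is
invented for the example, and the pooled two-proportion z statistic is one common
way of comparing the two groups, not necessarily the procedure recommended in
Chapter 8.

```python
import math

def proportion_difference(sites_near, sites_total, control_near, control_total):
    """Compare the share of sites near water with the share of siteless
    control locations near water, using a pooled two-proportion z statistic."""
    p_sites = sites_near / sites_total
    p_control = control_near / control_total
    pooled = (sites_near + control_near) / (sites_total + control_total)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sites_total + 1 / control_total))
    z = (p_sites - p_control) / std_err if std_err > 0 else 0.0
    return p_sites, p_control, z

# Hypothetical counts: 90 of 100 recorded sites and 180 of 200 siteless
# control locations lie within 5 mi of water.
p_sites, p_control, z = proportion_difference(90, 100, 180, 200)
print(f"sites: {p_sites:.2f}, control: {p_control:.2f}, z = {z:.2f}")
```

When the two proportions are equal, as here, the statistic is zero: proximity to
water describes the study area as a whole and does not distinguish site locations
from the background.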

Another reason for model testing is to determine the nature and strength of
relationships that may have been discovered. Often several environmental
phenomena occur together in nature (a relationship known as autocorrelation). Testing
can help us to understand which of the co-occurring variables exerts the greatest
influence on site location, and this information in turn permits us to evaluate the
explanatory potential of the model. Perhaps the most obvious reason to test a model
is to determine its overall accuracy rate. Accuracy rate and precision are generally
inversely related. The more precise a model-like statement, the less accurate it is
apt to be. This relationship is treated below in greater detail.


The procedures used in model testing have been treated extensively throughout
this volume. Procedures for model validation and generalization are presented
in Chapter 5. Various strategies for testing models based on existing data are
presented in Chapter 7, along with techniques for integrating new data. In Chapter
8 the discussion covers several quantitative methods that can be applied to data
collected through some form of probabilistic sample and carry with them some form
of reliability measurement, such as confidence intervals or probabilities. The gain
statistic is suggested as a useful measurement for comparing accuracy rates among
models. Three types of testing procedures are described in order of increasing
precision. Two, referred to as split sampling and the jackknife method, are based on
testing the model against some portion of the original data used to develop the
model. A third involves collecting new and independent data from the project area.
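

The difference between split sampling and the jackknife method can be shown with
a deliberately simple sketch. Everything here is assumed for illustration: the
"model" is a single distance-to-water cutoff, the sample units are hypothetical,
and the split is an alternating one so that the example stays deterministic (a
real test would assign units to the two halves at random).

```python
def fit_cutoff(units):
    """Fit a toy model: the cutoff is the midpoint between the mean distance
    to water of site units and of siteless units."""
    site_d = [d for d, has_site in units if has_site]
    empty_d = [d for d, has_site in units if not has_site]
    return (sum(site_d) / len(site_d) + sum(empty_d) / len(empty_d)) / 2

def accuracy(cutoff, units):
    """Share of units classified correctly when a site is predicted for
    any unit closer to water than the cutoff."""
    hits = sum((d < cutoff) == has_site for d, has_site in units)
    return hits / len(units)

def split_sample_test(units):
    """Build the model on every other unit and test it on the units held back."""
    train, test = units[0::2], units[1::2]
    return accuracy(fit_cutoff(train), test)

def jackknife_test(units):
    """Leave each unit out in turn, refit on the rest, and test on that unit."""
    hits = 0
    for i, (d, has_site) in enumerate(units):
        cutoff = fit_cutoff(units[:i] + units[i + 1:])
        hits += (d < cutoff) == has_site
    return hits / len(units)

# Hypothetical sample units: (distance to water, site present?)
units = [(0.5, True), (4.0, False), (5.5, False), (1.0, True), (1.5, True),
         (3.5, False), (6.0, False), (2.0, True), (3.0, True), (2.5, False)]

print("split-sample accuracy:", split_sample_test(units))
print("jackknife accuracy:", jackknife_test(units))
```

Both procedures reuse the original data, so neither substitutes for the third
approach, in which new and independent data are collected from the project area.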


Several discussions of management concerns and model testing occur in this
volume. In Chapter 3, Altschul distinguishes between wasteful errors (where a
model predicts a site and none occurs) and gross errors (where a model predicts no
sites and sites occur). In the latter case, the potential for inadvertent site
destruction in many management applications is increased.

In Chapter 8, Kvamme discusses reduction of gross errors by adjusting the
cutoff point of a model's decision boundary (a mathematical boundary), an approach
that applies only to quantitative models. As an example of the relationship between
gross and wasteful errors, perhaps a model permits us to say that 80 percent of the
sites in a study area will be located on 30 percent of the land surface in that area. This
represents a substantial reduction in the amount of land surface to be addressed
further, but it also carries with it the potential for gross errors affecting 20 percent of
the sites. Using the same model, but adjusting the cutoff point, we may be able to
say that 90 percent of the sites will be located on 70 percent of the land surface. This
reduces the risk of gross errors, while increasing the possibility for wasteful errors.
The implications of this discussion for management applications are significant.
Reducing study area size by 30 percent would represent a substantial and desirable
increase in project efficiency, especially if it could be accomplished with little or no
risk to the resource.
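

The two cutoff settings in this example can be compared numerically. The sketch
below assumes the gain statistic takes the form in which it is usually written,
1 minus the ratio of the percentage of area to the percentage of sites; the
function and field names are invented for the illustration, and the percentages
are those used in the paragraph above.

```python
def gain(pct_area, pct_sites):
    """Gain statistic: 1 - (percent of area flagged / percent of sites captured)."""
    return 1 - pct_area / pct_sites

def error_profile(pct_area, pct_sites):
    """Summarize one cutoff: gain, percent of sites falling outside the flagged
    area (exposure to gross errors), and percent of land still to be addressed
    (the area within which wasteful errors can occur)."""
    return {
        "gain": round(gain(pct_area, pct_sites), 3),
        "sites_missed_pct": 100 - pct_sites,
        "land_retained_pct": pct_area,
    }

for label, area, sites in [("tight cutoff", 30, 80), ("loose cutoff", 70, 90)]:
    print(label, error_profile(area, sites))
```

The tighter cutoff yields the higher gain but leaves a fifth of the sites outside
the flagged area; the looser cutoff halves that exposure at the cost of retaining
more than twice as much land for further work.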


In Chapter 3, Altschul cautions against passing the point of diminishing
returns in model testing. This occurs when substantial increases in collection of new
inventory data result in little increase in accuracy. There are many possible causes
of this phenomenon, including the influence of such social factors as presence of
large habitation sites, trade networks, and kinship groups, which override the
influence of factors of the natural environment in determining site location and
which are not addressed in the modeling effort.


An important consideration for evaluating models is their ability to take into
account rare sites. These sites constitute a very small portion of the site population
either by virtue of their own characteristics or by virtue of their location in relation
to the environment. A site type can be rare without being impossible to model;
most models do not address these sites, however, because their low numbers make
most statistical techniques unusable.


The rare-site problem increases when sample inventories at low sampling rates
are used to generate the data base for model development. When only a small
percentage of the surface area is surveyed, the chances for discovering a rare site
clearly are reduced. If any sites of a rare type are known in the study area, specialized
inventory strategies can sometimes be devised to increase the potential for
discovering more of these sites. If large village sites have been found only in riparian areas,
for example, riparian areas could be sampled at a higher rate than other areas to
increase chances for discovering this type, and compensation for the higher
proportion of riparian areas surveyed in an otherwise random sample can be achieved
during later analyses of the data.
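

One way to carry out the compensation mentioned above is to weight each
stratum's results by the inverse of its own sampling rate. The sketch below is a
minimal illustration under stated assumptions: the strata, unit counts, and site
counts are hypothetical, and the function names are invented for the example.

```python
def stratified_total(strata):
    """Estimate the total number of sites by expanding each stratum's count
    by the inverse of its own sampling rate."""
    total = 0.0
    for s in strata:
        weight = s["units_total"] / s["units_surveyed"]  # inverse sampling rate
        total += s["sites_found"] * weight
    return total

def naive_total(strata):
    """Estimate that ignores the unequal sampling rates (shown for contrast)."""
    found = sum(s["sites_found"] for s in strata)
    surveyed = sum(s["units_surveyed"] for s in strata)
    units = sum(s["units_total"] for s in strata)
    return found / surveyed * units

# Hypothetical design: riparian units surveyed at 50 percent, all others at 5 percent.
strata = [
    {"name": "riparian", "units_total": 40, "units_surveyed": 20, "sites_found": 6},
    {"name": "other", "units_total": 400, "units_surveyed": 20, "sites_found": 1},
]

print("weighted estimate:", stratified_total(strata))
print("unweighted estimate:", naive_total(strata))
```

With these figures the unweighted estimate is more than double the weighted one,
because it treats the heavily sampled riparian units as if they were typical of the
whole study area.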


Several other factors, any one of which could seriously affect a model's validity
and usefulness, should be taken into account in evaluating a model. The manager
should carefully analyze the appropriateness of all statistical procedures and
analyses used in model development. Common problem areas include biases in the
sampling procedures, failure to meet statistical assumptions about the data, and
inappropriate use of environmental data. Often, for projects incorporating
advanced statistics, the services of a professional mathematician will be needed.


Models should be evaluated for their completeness. Did they address changes
in the environment through time? Are there biases in the sample design that might
affect the reliability of the data? Also, the resolution of the model is important. If the
management need is for statements specific to quarter-section parcels, broad zonal
models may not be useful.


Inductive or correlative models have limited explanatory value because they
do not account for observed correlations between independent and dependent
variables. For example, empirical analyses may demonstrate that a certain type of
site in a sample is always located within a limited distance of outcrops of a particular
geologic formation. While this information may be very useful in certain contexts, it
has not been demonstrated that the presence of outcrops actually affected site
locations. Independent evidence, such as the presence of specialized features or
artifact types, is needed to support such an interpretation. This does not mean that
the observed correlation is or is not valid; it means that we cannot explain why it
occurred and thus we are no closer to an understanding of the broader cultural
system that we are attempting to model. The utility of the model is limited to the
observed study area. The need for independent testing to establish a
noncoincidental relationship between the independent and dependent variables, in this case
outcrops and site locations, is especially important because of the strong tendency
for autocorrelation among environmental variables. Correlative models are useful
because they direct these independent tests.


Field procedures are another factor to consider when modeling projects are
being evaluated. For instance, the spacing of crew members and procedures for
defining and recording sites can significantly affect the kinds of data that are
available for analyses. Biases in field procedures should be explicitly stated in project
reports, and their impact on the results of the modeling efforts should be evaluated.

Finally, the interpretability of the model is important. Is the model simple
enough to be understood and explained in anthropological terms? Does it relate
environmental and site variables to the everyday world? If not, it may not be usable
by future researchers in a cultural resource management context.





FUTURE DIRECTIONS 


Predictive modeling holds much promise for cultural resource management in
land-managing agencies, even though it is currently in a highly experimental and
rapidly changing state. The information in this volume is not intended to limit or
confine this development; rather, the intent is to crystallize issues and focus
discussions on a common ground, to the benefit of both the agencies and the
professional archaeological community.


At the present time, no major policy directives have been issued by a large
land-managing agency concerning the development and use of models in cultural
resource management programs. Many would argue that such directives would be
premature. Many others would argue, however, that modeling has ceased to grow
and contribute to our understanding in the way that it should because of a lack of
focus and purpose in agency efforts. Altschul summarizes this concern in Chapter 3:


Perhaps the most significant criticism that can be made about predictive modeling
programs in most cultural resource management contexts is that there is no consensus as
to the overall objective of these programs.


Current efforts are seen as diffuse and lacking in momentum and direction.
Rather than working toward refining existing models or developing new types of 
information or methods, agencies sometimes develop new models that suffer from 
the same limitations as previous ones. Short-term goals are being pursued exclu- 
sively, perhaps because long-term goals have never been clearly defined or because 
incremental, long-term funding has never been available. 


The information in this volume should help agencies to identify means of 
increasing the efficiency and effectiveness of their modeling efforts. The first, of 
course, is to develop the existing data base so that maximum use can be made of 
previously collected information. In addition, agencies need to augment the exper- 
tise of their staffs so that they will be able to evaluate and participate more fully in 
the modeling programs. This might involve specialized training courses in the 
evaluation and application of models and especially in the use of sampling tech- 
niques. Although advanced statistics will no doubt remain beyond the reach of the 
average staff person, some basic training courses in the types of models and their 
assumptions and requirements may be helpful. Only through this kind of staff 
development will agencies begin to use modeling effectively and creatively to direct 
and develop projects meeting specialized data needs. Only through this process will
modeling be used as a long-term strategy, where it can be most effective and 
efficient. 


There is a clear need to develop new ways to measure and define both 
dependent and independent variables. This can involve manipulation of tremen- 
dous amounts of information, for which remote sensing technology and geographic 
information systems are essential. Excellent and detailed discussions of these topics 
are provided by Ebert (Chapter 9) and Kvamme and Kohler (Chapter 10). Agencies 
should be aware of the potential contribution of geographic information systems to 
cultural resource modeling and make special efforts to ensure that the needs of the 
cultural resource program are met in the design of these systems. Because the 
potential contribution of a GIS is significant, consideration should be given to 
funding specialized research projects to explore possible applications of this new
technology.


Finally, agencies need to focus on the development of explanatory theory. The 
kinds of information that can be obtained through traditional cultural resource 
surveys are limited. Surface observations made during the course of these surveys 
are based on “best guess” estimates of limited types of information. While this
information is useful in the formulation of ideas and hypotheses about prehistoric 
societies, qualitatively different types of information are often needed to develop 
and test explanatory theories. This information, on topics such as diet, environ- 
mental exploitation patterns, technology, etc., can often only be collected through 
subsurface testing and excavation, accompanied by detailed laboratory analyses and 
studies, and through analysis of pertinent ethnographic, historical, and other 
nonarchaeological data. These approaches involve additional costs and for this 
reason are often not included in standard inventory approaches. 




















In order to further the development of explanatory theory and to increase the 
accuracy and usefulness of modeling efforts on a larger scale, agencies should 
seriously consider sponsoring research projects designed to measure complex social 
and economic parameters as they apply to the archaeological record. In Chapter 4 
Ebert presents an excellent discussion of an innovative approach known as “distri- 
butional” archaeology. Here, traditional site types are seen as artificial constructs 
developed by archaeologists, which at best only poorly reflect behavioral systems. 
Analysis is focused on distributions of artifacts across a landscape as they relate to
larger patterns of land use. Experimental work using this technique has taken place 
already in several land management contexts, and it appears to hold much promise 
for future advances in explanatory theory. Efforts such as these should not only 
serve to advance the state of predictive modeling, they should increase the effi- 
ciency and effectiveness of cultural resource management programs as well. 




















Chapter 12


AN APPRAISAL 


W. James Judge and Daniel W. Martin 


In December of 1981 the Bureau of Land Management issued an instructional 
memorandum encouraging the development and use of predictive modeling in 
cultural resource management. Initial official interest in modeling by the bureau 
was in conjunction with the timely processing of “Applications for Permits to Drill” 
(APDs) for oil and gas. The oil and gas industry had recommended that the bureau 
initiate “regional reviews to identify areas of high and low probability for significant 
cultural resources, as a means for eliminating unnecessary surveys.” The
assumption was that “given an adequate data base, informed decisions can be made about
where to concentrate additional identification and protection endeavors, to the
exclusion of certain other areas” (Burford 1981). 


The direction given by BLM headquarters at that time was as follows: 


States with heavy APD workloads are encouraged to consider developing predictive or 
sensitivity models for areas where it appears that cultural resource density and distribu- 
tion lend themselves to the approach. Any such efforts should be directed primarily 
toward areas with high demand, where there is also an existing basis for the expectation 
of a relatively low site population, regularity of site situation, similarity of site informa- 
tion potential, or other reasons for anticipating that the exercise will lead to a product 
that alleviates the cultural resource identification demands on BLM and industry, 
without creating an unacceptable risk to cultural resources [Burford 1981].


In attempting to implement the memorandum, resource managers found that
predictive modeling was being employed in a wide variety of ways and that there 
was little mutually agreed-upon theory, method, or policy to guide the use of this 
technique. As a result, a proposal was developed by the BLM to fund a project that
would address these issues. The project was approved and funded, resulting in the 
production of this volume. 


The proposal established the following goals for the project: 


1. to evaluate trends in the development and application of predictive 
modeling critically, using knowledge gained through past research efforts; 


2. to explore the feasibility and practicality of predictive modeling for 
meeting management objectives; 


3. to analyze and define the components of the model-building process,
particularly with respect to cultural resource management;


4. to develop a set of standards for archaeological and environmental data to
be used in modeling efforts; and 


5. to provide BLM field officers with information on data collection for 
modeling purposes and statistical manipulations of those data. 


The process by which the authors, editors, and advisory committee were 
selected and the lengthy course of peer and federal review to which the draft was 
subjected have been discussed in Chapter 1. In this chapter we will (a) evaluate the
volume in regard to its success in achieving the goals outlined at the beginning of
the project, (b) summarize the results of the peer review, and (c) discuss what we
consider to be several important issues raised by this volume. 


EVALUATION OF PROJECT GOALS 


In general, the five goals presented in the initial project proposal were realized. 
The first, that of critically evaluating trends in the development and application of 
predictive modeling, is thoroughly addressed throughout the volume. 


The second objective, that of determining the feasibility and practicality of 
predictive modeling as a useful technique for meeting federal management objec- 
tives, is addressed extensively in Chapter 11 and will also be discussed later in this 
chapter. We may note in passing, though, that to a certain extent apparent 
“success” in meeting this goal depends on how those federal management objec- 
tives are perceived. For some managers in 1983, predictive modeling was viewed as a 
technique that was going to rescue them from the burden of compliance with 
Section 106, permitting them to get by with minimal field survey and thus minimal 
expenditure of very scarce funds. To those individuals, the results of this volume 
may well be disappointing. To those who were looking for the satisfaction of more 
general, long-range objectives, the results will be received much more favorably. 


The third objective, that of analyzing and defining the components of the 
model-building process as they apply to cultural resource management, is also 
addressed in detail in this volume. It is apparent that model-building is a very 
complex and time-consuming process. Nevertheless, there is freedom of choice as to 
how to proceed with modeling, and some ways of putting it all together may be 
more effective than others, depending on the situation and the needs. Again, 
Chapter 11 offers step-by-step considerations to guide modeling efforts for those 
with land managing responsibilities. 

The fourth goal, to develop a set of standards for the archaeological and 
environmental data required to prepare predictive models, is somewhat more 
difficult to evaluate. In the literal sense, little in the way of a set of standards was 
developed by any of the authors. Their reluctance to provide a “cookbook” 
approach, which is implicit in the concept of standardization, is understandable, 
given the variability in modeling approaches and management objectives, as well as 
regional physiographic and cultural differences. If, however, we consider “stand- 
ards” to be a set of guidelines for data requirements in the model-building process, 
then the goal was met since the data requirements are standard in the sense that 
those agreed on as acceptable are presented in detail. For example, tolerable levels 
of error for data entry, choice of appropriate soil survey detail (e.g., Soil Surveys I, 
II, III), and appropriate cell-size choice for DEM (Digital Elevation Model) data are 
among the “standards” presented. Importantly, it is noted that each of the choices 
made must be tailored to a specific objective and phase of the modeling process and 
to specific regional circumstances. 

In one sense, groundwork for development of more standardized data is 
provided in this report. Perhaps the best way to establish such standards would be 
to develop them from data used in actual field and management applications. 
Standards developed in this way would thus be based on actual management 
successes and would minimize the possibility of error. 


With respect to the final objective, that of recommending types of field 
inventory data to be collected and of developing specific procedures for field office 
use, only the initial part of this goal has been met in detail: recommendations 
regarding field inventory data are found throughout the volume. The second part is 
left quite open, again because of our reluctance to provide a cookbook approach, 
and also to enable field offices to pick and choose among techniques themselves so 
that local management needs are addressed by the most efficient means. 


In our view, then, the objectives of the project were effectively met, particu- 
larly when one considers the complexity of the subject matter, and the absence of a 
well-developed body of theory and method for predictive modeling when the goals 
were established. 


AN APPRAISAL OF THE REVIEW COMMENTS 


This volume benefited from extensive peer review. The invitation to review 
was extended to numerous organizations in order to create a document that 
represented participation from a broad spectrum of the professional archaeological 
community. Comments were received from the following organizations: Bureau of 
Land Management offices, State Historic Preservation offices, the National Park 
Service, the Department of the Army, the Bureau of Indian Affairs, the Advisory 
Council on Historic Preservation, the Bureau of Reclamation, the Forest Service, 
the Soil Conservation Service, the Society for American Archaeology, and a number 
of universities. The responses provided substantive comments on theoretical, 
methodological, technical, management, procedural, legal, and regulatory issues 
presented in the draft version. Even the most critical reviewers felt that the volume 
was an important contribution and should be published. 


Many of the comments suggested that the dichotomy between correlative and 
explanatory modeling was artificial and that the importance of explanatory models 
was over-emphasized (especially as being superior to correlative models). Some felt 
that the dichotomy between the kinds of models was useful primarily in a heuristic 
sense, while others supported the research commitment to explanatory modeling 
but felt the important role of the correlative approach in the development of 
predictive models should be acknowledged. Some comments noted that in the 
normal scientific process such contrasted approaches are actually complementary, 
but that the empirical search for patterns may well precede the quest for explanation. 


A majority of the reviewers felt that the report was too negative about the 
potential of modeling in CRM contexts. Most of the federal reviewers felt that the 
early chapters were unnecessarily “academic” or pedantic, and that more practical 
advice was needed (Chapter 11 was not available with the review draft). Some of the 
polemic regarding distributional archaeology (Chapter 4), for instance, and the 
extended debate regarding inductive and deductive issues were felt to be of little 
value by this group of reviewers. 


Archaeologists with management responsibilities feared that the suggested 
potential of predictive modeling was too limited. They were looking for practical 
methods to provide better information about cultural resources in order to make 
realistic recommendations to management. Archaeologists without management 
responsibilities appeared to fear that the technology, if allowed to go unchecked, 
would be applied by the government in an irresponsible manner. In this vein, 
federal reviewers felt that the orientation of the volume appeared to be toward 
archaeologists without management responsibilities. 


All in all, the peer review comments, which themselves comprise hundreds of 
pages, proved to be extremely helpful in guiding the development of the final 
volume in a direction most useful to the diversity of the anticipated audience. 


THE ISSUES RAISED 


A number of key issues have been raised in this volume regarding the 
relationship between an emergent technology based largely in theory and practical 
everyday management needs. Here we will summarize four of the issues that we feel 
are extremely important to the topic of predictive modeling for both archaeological 
research and cultural resource management. 


The first issue is that of the complexity of the process; modeling past human 
activities is not a simple task. Humans, fortunately, do not behave mechanistically, 
and thus generalizations about their behavior are difficult to derive and can never 
be completely accurate. The relationships among humans, their activities, and past 
landscapes are very complex to begin with, and this complexity is increased by 
subsequent changes in those landscapes, by a depositional record that is both 
incomplete and complex, and by the difficulty of the quantitative methods that one 
must employ to model these relationships, methods that are frequently beyond 
the expertise of those who wish to use them. Modeling is a tool, but it is by no means 
a simple tool and it is not a panacea. As a complex tool, its uses are limited, and it 
requires expertise to implement correctly. As with any tool, modeling can be 
abused, and the value of the results diminishes accordingly. Used properly, however, 
modeling can be of inestimable value to both the manager and the research 
archaeologist. This volume, we feel, presents the complexity of the modeling 
process well, and Chapter 11 details its appropriate uses in the management 
context. 


The second issue raised is that of the role of predictive modeling in the 
compliance process, that is, in efforts to comply with Sections 106 and 110 of the 
National Historic Preservation Act. This, of course, was one of the key concerns 
that stimulated the project in the first place. Managers were almost desperately 
seeking some way to address compliance problems in a cost-effective manner that 
would also protect the resources. Archaeologists may have felt that cost- 
effectiveness was taking precedence over resource protection, but many managers 
saw the situation differently. Shortly after the release of the BLM instructional 
memorandum noted at the beginning of this chapter, a project was proposed by 
BLM staff that was to use 


statistical discriminant analysis techniques to develop a model to predict the probability 
of cultural resource occurrence from environmental parameters and evaluate the utility 
of this methodology as a tool in cultural resource assessment on potential oil shale and 
coal lease areas. . . . Once the model is developed and tested it can be turned over to the 
District or Area Office Archeologist where it can be used operationally to predict the 
probability of site occurrence on rights-of-way applications, access corridors and drill 
pad clearances. If in this stage high probabilities are present, the corridor could be moved 
to a lower probability zone. In other cases, the probability could be used to ease the 
requirement to have a site visit prior to clearance [Garratt 1982]. 


Clearly, managers were having problems with the compliance process, and 
expectations that predictive modeling would solve or lessen those problems were 
high. 
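

The kind of model described in the proposal can be sketched in a few lines. This is a minimal illustration only, not the procedure the BLM proposal actually used: it assumes just two environmental variables (distance to water and slope), equal prior probabilities for site and non-site cells, and a pooled covariance matrix. All function names and data values are hypothetical.

```python
import math


def fit_two_class_discriminant(site_cells, nonsite_cells):
    """Fit a two-variable linear discriminant using a pooled covariance matrix."""

    def mean(rows):
        n = len(rows)
        return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

    def scatter(rows, m):
        # Sums of squares and cross-products around the class mean.
        sxx = sum((r[0] - m[0]) ** 2 for r in rows)
        sxy = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
        syy = sum((r[1] - m[1]) ** 2 for r in rows)
        return sxx, sxy, syy

    m1, m0 = mean(site_cells), mean(nonsite_cells)
    a, b = scatter(site_cells, m1), scatter(nonsite_cells, m0)
    dof = len(site_cells) + len(nonsite_cells) - 2
    sxx, sxy, syy = ((a[i] + b[i]) / dof for i in range(3))

    # Weights = inverse(pooled covariance) * (difference of class means).
    det = sxx * syy - sxy * sxy
    d0, d1 = m1[0] - m0[0], m1[1] - m0[1]
    w = [(syy * d0 - sxy * d1) / det, (-sxy * d0 + sxx * d1) / det]

    # With equal priors, the 0.5 boundary passes midway between the class means.
    bias = -0.5 * (w[0] * (m1[0] + m0[0]) + w[1] * (m1[1] + m0[1]))
    return w, bias


def site_probability(model, cell):
    """Posterior probability that a cell contains a site, under the model's assumptions."""
    w, bias = model
    score = w[0] * cell[0] + w[1] * cell[1] + bias
    # Numerically stable logistic transform of the discriminant score.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


# Hypothetical training cells: (distance to water in km, slope in degrees).
SITE_CELLS = [(0.2, 3.0), (0.5, 5.0), (0.3, 2.0), (0.6, 4.0)]
NONSITE_CELLS = [(2.0, 12.0), (1.5, 15.0), (2.5, 10.0), (1.8, 14.0)]

model = fit_two_class_discriminant(SITE_CELLS, NONSITE_CELLS)
print(site_probability(model, (0.3, 3.0)))   # cell close to water, gentle slope
print(site_probability(model, (2.0, 13.0)))  # cell far from water, steep slope
```

Even in this toy form the output is a probability attached to a grid cell, derived entirely from the cells already surveyed; it says nothing about why sites occur where they do.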


In Chapter 11, Kincaid points out that Section 106 compliance decisions are 
made on a case-by-case basis through the consultation process, and that there are 
no set criteria for determining appropriate inventory and evaluation strategies 
apart from such consultation. In brief, there can be no “cookbook” approach to the 
role of modeling in that process. We can, however, summarize the value of modeling 
in the inventory process in general, whether it be for research or management 
purposes. 

Predictive modeling of archaeological site locations can never be a complete 
substitute for actual field inventory (intensive survey). As noted above, not only is 
human behavior too complex to permit this kind of modeling accuracy, but too 
many variables have intervened between the time that the behavior took place and 
the present to allow us to achieve through modeling the accuracy available with 
field inventory. For this reason, it is unlikely that predictive modeling could, in the 
foreseeable future, be sufficiently accurate to satisfy the identification requirements 
in 36 CFR 800.4 (the implementing regulations for Section 106 of the National 
Historic Preservation Act; see also Secretary of the Interior's Standards and Guidelines, 
Federal Register 48(190):44721-44723). By the same token, predictive modeling 
is unlikely to satisfy the needs of a research archeologist whose research design 
requires accuracy at a similar level. 


Modeling can, however, provide research archaeologists with estimates of 
probable site densities in unsurveyed areas, and this same capability is of great 
potential benefit to the manager. As noted in Chapter 11, the role of modeling in the 
planning process is perhaps its most valuable contribution. In the short term, for 
example, the ability of models to project areas of low site density or to indicate 
probable locations of sites not suited for data recovery can be extremely helpful to 
the manager, not as a substitute for inventory but as an aid in designing cost- 
effective inventory. 


Modeling’s greatest strengths, however, lie in its contributions to the long- 
term planning process. It is here that models developed with resource planning, 
interpretation, and evaluation in mind can be of tremendous value to the establish- 
ment of management priorities and to the integration of cultural resource manage- 
ment with other resource management responsibilities. Further, such model-based 
management can facilitate research, quite apart from the preservation and protec- 
tive responsibilities of the manager. Since a fundamental purpose of cultural 
resource preservation is to maintain the scientific potential of the resource, that is, 
to preserve its information content, modeling as a component of long-range plan- 
ning is of particular value to managers and researchers alike. 


The third major issue raised in the volume has to do with the theoretical basis 
of predictive modeling. Certainly the volume provides a critical summary and 
evaluation of current perceptions about the relationship between modeling and 
theory. Aspects of theory dealt with include examination of the systemic, 
archaeological, and analytic contexts, as well as site formation processes. Normative vs. 
processual theoretical approaches as they relate to modeling efforts are also 
detailed. 


The most fundamental theoretical issue to emerge, however, is that of the 
dichotomy between correlative and explanatory models. This dichotomy arises 
from the contrast between inductive and deductive logic, although the terms 
deductive and explanatory and the terms inductive and correlative are not synonymous. 
Technically, models themselves are either explanatory or correlative; the terms 
deductive and inductive refer to how the models are derived and to the kinds of 
arguments involved in their implementation. Correlative models tend to be induc- 
tively derived (but not exclusively so), and explanatory models should contain 
arguments of both types. 


In Chapter 2 this theoretical dichotomy is discussed with respect to the various 
contexts in which archaeological investigations are carried out. 


The challenge for inductive models is to build the bridge to the systemic context by 
making the analytic methods (including discovery) as “transparent” (non-bias-making) 
as possible and by controlling for the effects of depositional and postdepositional 
processes on the archaeological context. 


Deductive models, on the other hand, begin with some theory predicting human 
behavior, in the systemic context. The challenge for deductive models is to build the 
bridge to the analytic context, which is where the outputs of the system can be observed. 
This bridge-building, whether from the systemic to the analytic context or vice 
versa, is referred to as explanation. . . . Explanatory models . . . are inherently neither 
inductive nor deductive. Instead, they are models that attempt to build the bridge 
between the dynamics of the living system and its observed outputs [Kohler, Chapter 2]. 


As noted in Chapter 11, the contrast between correlative-inductive and 
explanatory-deductive modeling becomes somewhat blurred in field modeling 
applications. In actual practice, correlative models are generally easier to develop 
and in specific situations may be more accurate in their predictive potential. These 
models are criticized, however, for their lack of ability to explain the phenomena 
predicted. Archaeologists are concerned about the explanation of past human 
behavior, and there is general agreement that we should not be satisfied with only 
the demonstration of correlations, but that we must also provide explanations for 
those correlations. Even if it is acknowledged that archaeologists consider explana- 
tion to be the goal of modeling, however, a fundamental question still remains: how 
necessary is such explanation to the actual everyday management of cultural 
resources? This, in itself, is a key issue raised by this volume. 


As noted above, archaeological resources are most often preserved for their 
information content. There is no question that the inherent information can best be 
extracted through the explanatory process, and correlative models, because they 
are derived inductively, cannot contribute as much to the extraction of this 
information as models with a consciously explanatory orientation. But this is not the 
central question in cultural resource management. In that context we must ask, 
what is the best technique to preserve the resource? What is the most cost-effective 
means to achieve preservation, and to what extent is explanation necessary for 
effective management? By “preservation” here, we refer to the full complement of 
tasks involved in resource management, including discovery, recording, evaluation, 
conservation, and protection. There are no simple answers, but we may offer some 
comments. 


Basically the issue is this: should the manager select a correlative model, 
which is easier to design, takes less time to develop, and is initially more accurate, or 
should he or she plan to use an explanatory model, which is more complex and 
difficult to develop and may not be as accurate a predictor? At first glance, the 
answer would seem to be simple: go with the correlative model, and let archaeolo- 
gists with research interests develop their own explanatory models at some time in 
the future. In that way, the resource will have been protected in a cost-effective 
manner. After all, management is under no legal obligation to provide explanation 
as part of the preservation process. 


Yet the decision is not that straightforward. Correlative models are not 
immediately “transferable,” that is, when developed for one geographic location, 
they do not necessarily work in another; there is no logical reason they should. The 
question then is whether it is more cost-effective to redevelop (or at least refine and 
reaffirm) the correlative model for use in a new area or to develop the explanatory 
model in the first place, since the latter would be applicable in a variety of areas and 
would address other management needs (interpretation, evaluation) at the same 
time. Ultimately, this question can only be resolved on a case-by-case basis where 
all the variables to be considered can be evaluated properly. But certainly prior to 
investing time and funds in the development of an explanatory model, the manager 
must determine whether it actually would be as easily transferable as claimed and 
whether it will be accurate enough to satisfy resource preservation and protection 
requirements. We feel that research and management archaeologists alike would 
agree that, if one has the time and funding, explanatory models will be more 
generally productive in the long term, and thus ultimately more cost-effective. But 
such decisions must be made for each specific instance by managers, employing the 
best information possible at the time. 


One further aspect of the dichotomy between the two types of models is the 
supposition that explanatory models may serve management better in the process 
of site evaluation. There is little question that the determination of the significance 
of a site, or class of sites, may be enhanced by the deductive process integral to 
explanatory model development. Yet at times significance may have to be deter- 
mined on the basis of the resource’s potential, rather than the demonstrated 
contribution of information. This is true in archaeology, where sites frequently 
cannot be excavated, and thus the information content cannot be fully demon- 
strated through deductive testing. In such cases, the potential significance is 
assessed from surface indications, and at this level of evaluation, correlative models 
may be as effective as their explanatory counterparts in indicating a resource’s 
potential contribution to scientific knowledge. Again, the cost-effectiveness of 
redeveloping correlative models for use in other areas may be the key decision that 
managers have to make. 


A fourth issue raised in this volume was that of the technology and expertise 
necessary to implement modeling effectively. Sophisticated hardware and software 
capabilities are requisite, as well as well-trained and informed individuals at all 
managerial and support levels. 


For example, it has become clear that successful application of certain models 
may require the use of a geographic information system (GIS). The quantity as well 
as the quality of analyses necessary require automated spatial analysis of data. 
Remote sensing techniques provide a source of data for GIS analysis. The availabil- 
ity of multispectral, high-resolution digital imagery opens up exciting possibilities 
for pattern recognition techniques presented in this volume. The dramatic leap to 
10 m resolution by the SPOT satellite is only the beginning; far more detailed 
resolution will be available in the future. The scale of measurement of the instru- 
ment has been one limiting factor, along with limited processing capabilities for 
gigabytes of data. These technologies are improving, and the speed of this 
improvement provides an insight to the level of refinement we may expect from the 
modeling process in the future. The basic statistical, modeling, and pattern recog- 
nition theories, methods, and techniques presented herein provide the foundations 
upon which to build powerful new instruments of measurement and analysis. At 
present too few people in management and support positions have the requisite 
skills in geographic information systems (Burrough 1986), statistics, remote sensing, 
and modeling to exploit the technology available currently, let alone develop future 
applications. 


Another problem is that of obtaining access to the most capable systems and to 
adequate data bases. Access to such systems with diverse data themes and regular 
data maintenance is most readily available to persons who work for, or have some 
formal connection with, large land-managing agencies. Such systems require an 
organizational support structure difficult to justify for single-purpose analysis. 
Large land-managing agencies are supporting such systems on the basis of their 
utility to overall land management analysis. Included in such support is providing 
quality software and hardware, software development, management, and various 
levels of staff skills, training, and technical assistance. 


These are some, but by no means all, of the issues raised in this volume that we 
feel are extremely important to both research and resource management as they 
relate to predictive modeling. The issues that have not been summarized here may 
have equal significance in particular modeling applications. One of the purposes of 
this volume has been to bring a wide range of issues in the domain of predictive 
modeling to the fore. 


CONCLUSIONS 


Predictive modeling can clearly be a worthwhile component of cultural 
resource management, if for no other reason than that it injects rigor into the 
management process and serves to integrate management with archaeological 
research. The process of modeling and the preparation and development of models 
are extremely valuable assets to management, regardless of the ultimate “success” 
of the models. 


After a thorough review of predictive modeling, this volume reaches some 
conclusions that contradict past attitudes and expectations held by land-managing 
agencies. The Bureau of Land Management's proposal noted previously (Garratt 
1982), for example, dealt with only a part of an overall process. We have learned that 
the application of “statistical discriminant analysis techniques” to environmental 
variables is not sufficient to develop a usable model. Certainly, the proposal made 
the process sound too easy and neglected much detail. We have learned that we 
must be sensitive to the facts and theories of site formation processes, and that it is 
necessary to incorporate theory from anthropology, archaeology, and other social 
scientific disciplines because site distribution is a reflection of human behavior 
interacting with physical phenomena in an ecosystem. Again, calling attention to 
the complexity of predictive models and the modeling process is an important 
contribution of this volume. 


Further, we have learned that modeling is a cyclical process of ongoing 
refinement, rather than a one-time event, and thus models cannot be developed by 
outsiders and then simply “turned over” to agency field office archaeologists for 
“application.” For many reasons the field archaeologists and managers need to be 
full participants in the modeling process. We can conclude that predictive model- 
ing, as defined and developed herein, is potentially the most cost-effective way to 
combine sound management practices with valuable research programs. Both are 
necessary ingredients for cultural resource preservation and interpretation in this 
country. 


It may well be that the most cost-effective and appropriate manner for 
managers to implement the techniques discussed in this volume would be to focus 
on the development of correlative models initially and then work toward refining 
their accuracy. This will demonstrate the potential of modeling and its effectiveness 
as a tool for cultural resource management. But the correlative-inductive approach 
should never be considered an end in itself. Instead these initial models should be 
specifically designed as integral components of the deductive approach to model 
development and as parts of the long-range planning process necessary to achieve 
the full potential of predictive modeling in resource management through ultimate 
reliance on explanatory models. 


Appendix 


A SURVEY OF PREDICTIVE LOCATIONAL MODELS: 
EXAMPLES FROM THE LATE 1970S AND EARLY 1980S 


The purposes of this appendix are to expose the reader to a range of projects 
that have developed predictive models and to provide succinct comparative sum- 
maries of these projects. A variety of geographic areas, archaeological manifesta- 
tions, and modeling approaches are represented. Twenty-two projects were judg- 
mentally selected from more than 100 reports. The longer list was not exhaustive; it 
reflected the interests of the authors of this volume and was generated by combin- 
ing lists of references provided by the authors and by the project advisory team. 


The projects summarized here represent a range of approaches and are not 
limited to the best or most successful examples; indeed, best and most successful are 
terms that would be difficult to define in a manner acceptable to all readers. Projects 
employing state-of-the-art approaches and some earlier examples of predictive 
models are included, as are examples of the less successful approaches. Information 
about the characteristics of what may be unsuccessful predictive models can be 
useful in providing the reader with a broad data base against which the usefulness of 
predictive models under a wide range of conditions may be evaluated. 


Among the 22 projects summarized here are studies from many portions of the 
United States (Figure A.1), from projects in Delaware (Custer et al. 1984) and 
Georgia (Kohler et al. 1980) to those in Washington (Mierendorf et al. 1981) and 
Alaska (Ebert and Brown 1981). The emphasis, however, is on the western 
states (e.g., Bradley et al. 1984). Included in the sample are models that predict the 
distribution of sites that are visible on the surface (Larralde and Chandler 1981), of 
sites that are deeply buried in flood basins (Muto and Gunn 1980), and of inundated 
sites on the continental shelf (Barber and Roberts 1979). Predictions of distributions 
are made for relatively undisturbed areas of the Great Basin ([...] 1984) and 
for highly developed areas along the eastern seaboard (Hasenstab [...]). There are 
models for predicting the density of sites in areas occupied by mobile [...] 
hunters and gatherers (Jermann and Aaberg 1976), and models concerning more 
sedentary Anasazi farmers (Woodward-Clyde Consultants 1978). Much of the time 
span of human occupation in North America is represented by these models. There 
are predictions for the locations of sites occupied by the earliest inhabitants of the 
Ozark highlands of Arkansas (Sabo et al. 1982) and predictions for the locations of 
recent Euroamerican ranches in the Salmon River Mountains of Idaho (Rossillon 
1981). 


Figure A.1. Map of North America showing approximate locations of predictive modeling 
projects discussed in the Appendix. 


The project summaries encompass deductively derived models (Thomas 1973) 
and inductively derived models (DeBloois 1975), including a deductive economic- 
decision-making model that predicts proportional use of the landscape (Hacken- 
berger 1984) and an inductive landform-analysis model designed to predict the 
general location of significant sites (Wildesen 1984). Some of the models can be 
tested with future survey data (Kemrer 1982), and other projects were developed as 
tests of existing predictive models (Thomas et al. 1983). Finally, the selected sample 
includes predictive models made by simple extrapolation from known to estimated 
site densities in large environmental zones (Plog 1983a) and very complex models 
developed using multivariate statistics and geographic information systems to 
generate probability estimates for site presence/absence in areas covering less than 
1 ha (Kvamme 1983). 
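

The simpler end of this range, extrapolation from known to estimated site densities by environmental zone, can be illustrated with a short sketch. It is an assumption-laden toy example, not a reconstruction of any of the cited projects: the zone names, surveyed areas, and site counts are hypothetical, and the calculation assumes that the surveyed sample in each zone is representative of the whole zone.

```python
def estimate_sites_by_zone(surveyed, zone_areas_km2):
    """Extrapolate expected site counts from surveyed samples.

    surveyed:        zone name -> (sites found, km2 actually surveyed)
    zone_areas_km2:  zone name -> total area of the zone in km2
    """
    estimates = {}
    for zone, total_area in zone_areas_km2.items():
        found, area_surveyed = surveyed[zone]
        density = found / area_surveyed  # observed sites per km2
        estimates[zone] = {
            "density_per_km2": density,
            "expected_sites": density * total_area,
        }
    return estimates


# Hypothetical survey results for two environmental zones.
surveyed = {"valley floor": (12, 4.0), "upland mesa": (3, 6.0)}
zone_areas_km2 = {"valley floor": 40.0, "upland mesa": 120.0}

estimates = estimate_sites_by_zone(surveyed, zone_areas_km2)
for zone, result in estimates.items():
    print(zone, result["density_per_km2"], result["expected_sites"])
```

An estimate of this kind supports planning at the scale of whole zones; it gives no basis for predicting whether any particular parcel within a zone contains a site.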


Once the selection of project reports to be summarized had been made, it was 
necessary to develop a list of attributes or variables that could be monitored for each 
report. The attributes monitored are (a) project location and size, (b) inventory 
method, (c) analytical techniques, (d) the nature of the model used or developed, 
and (e) the success of modeling efforts. The evaluation of each project also includes a 
discussion of other relevant topics introduced elsewhere in this volume. Toward 
these ends, the reports were examined in some detail. What might be called a 
“mental regression analysis” was performed to identify those variables that could 
be monitored with reasonable consistency and related to the topics discussed (and 
to the terminology employed) in the various chapters of this volume. On the whole, 
the terminology used here corresponds most closely with that utilized by Kohler in 
Chapter 2. 


The results of this survey of project reports are presented in two parts. The 
first part includes detailed information presented in a series of tables designed to 
facilitate comparisons of the various approaches. Summaries of each modeling 
project are presented in the second part, along with a few brief comments about the 
approaches used. Comments focus on the relationship between modeling objectives 
and results, as well as on innovative aspects of the methods employed. The overall 
discussion ends with some general observations about the nature of predictive 
modeling as represented primarily by the selected sample of project reports. Some 
of the comments are particularistic because they refer to a given aspect of a specific 
project. Other comments about a given project are made because that project is 
characteristic of a general approach to predictive modeling. 


TABULATED SURVEY RESULTS 


Descriptive and evaluative information about the reviewed projects is sum- 
marized in tabular form. Table A.1 provides information on general characteristics 
of each model—location, type of model (inductive or deductive), objectives, 
claimed accuracy (high, low, or percentage estimates), mode of presentation 
(tables, maps, charts), and verification approach (how the model was tested). This 
table also includes a general assessment or evaluation of each model. The evaluation 
criteria—falsifiability (can the model be disproved?), consistency (is it mathematically 
and logically sound?), simplicity (is it parsimonious?), and generalizability (can it be 
applied to other study areas and to human behavior in general?)—are essentially 
the criteria defined by Kohler in Chapter 2. An assessment is also made as to how 
thoroughly the environmental and cultural data were evaluated before they were 
used in the model. This assessment includes such questions as whether there was an 
effort to reduce redundancy, whether the reliability of map-based information was 
discussed, or whether other statistical techniques were considered. Results of this 
systematically judgmental assessment are presented as scores on a scale from 1 
(lowest) to 5 (highest). 


Table A.2 characterizes the models in terms of their general data bases and 
predictions. It presents information regarding the kind of sampling procedure used, 
the number of sites or cells included, and the size of the cells, transects, or grid units 
used to subdivide the sample. Levels of measurement (nominal, ordinal, interval, or 
ratio) used to define or describe environmental variables are also listed, as is the 
nature of the predicted resources (site type, site density, or site presence). The 
manner in which the survey area was classified into landforms or environmental 
types and into site density zones or site present/absent units is also summarized. 
The spatial resolution (e.g., block areas, landforms, grid units of various sizes) of the 
predictions and the nature of the predictions (e.g., site density, site presence, site 
significance, or site type) are characterized under the heading “Resolution of 
Predictions.” An evaluation of the thoroughness of the procedural discussions in the 
report is presented as a score on a scale from 1 (lowest) to 5 (highest). 


Information related to the environmental variables used in the models is 
presented in Table A.3. The listed physiographic divisions within which the 
projects are located follow Hunt's (1974) classification. Major types of contemporary 
land use are also listed, as is the size of the project or study area (i.e., the extent 
of the spatial population for which predictions are made). Environmental variables 
used to classify or to subdivide the project area (e.g., landform type, soil type, 
distance to water, elevation, and slope) are listed, as is the source of that information 
(e.g., various kinds of maps, field observation, and literature search). The modeling 
projects are rated from 1 to 5 assessing (a) the degree to which changing 
paleoenvironmental settings are considered and (b) the degree to which the effect of various 
depositional environments on the discovery of cultural resources and/or on our 
understanding of past human behavior is taken into account. The same scale of 
ranking is used to assess the level of discussion about the ecosystems within which 
humans operated. In other words, the scale provides a comparative measure of how 
well the investigators discuss the spatial and temporal distribution of food resources 
that may have been used by past groups of people. 


Cultural variables used in the modeling projects (e.g., site type, site size, 
artifact/feature types, or simply site location or presence/absence) are summarized 
in Table A.4. The culture area designation follows Driver’s (1961) scheme. Terminology 
used for known and predicted site types usually is taken from the referenced 
report. The sources of information about these cultural variables are also tabulated. 
The models are assessed on a scale from 1 to 5 according to the level of consideration 
given to understanding the human land-use systems represented by the debris on 
or in the ground. 


Table A.5 characterizes the nature and results of field investigations conducted 
to develop or test the models. In some cases fieldwork was not part of the 
modeling project; rather, existing survey data were used to build and/or test the 
models. For those projects for which new data were collected, information is 
provided regarding how the field data were used, the size of the survey area, and the 
general methods used to discover and/or record the resources. Some of the results of 
the fieldwork (number, types, and densities of sites discovered in the survey 
area) are tabulated. The general nature of the fieldwork is assessed by evaluating 
the reports (again on a scale from 1 to 5) according to the thoroughness of the 
discussion of constraints and limitations imposed by field methods. For example, is 
there a discussion of the kinds of sites that potentially remained undetected when 
subsurface deposits were not exposed (e.g., by clearing of duff or leaves, digging of 
test pits, or cleaning of existing cutbanks)? Did survey strategies result in the 
detection of the full range of known or theoretically expected site types? What were 
the effects of excluding areas from the survey or of arbitrarily distinguishing 
between sites and isolated finds on the basis of artifact density? 


Project reports are listed in chronological order in the tables and in the 
following summaries in order to afford the reader an opportunity to assess develop- 
mental trends. They span the time period from 1973 to 1984; 15 of the 22 were 
published after 1980. Reports that were published or printed in the same year are 
listed in alphabetical order. 


SYNOPSIS OF SURVEY RESULTS 


The summaries presented in this section provide a brief synopsis of modeling 
components of the 22 project reports. This information is intended to fill in some of 
the gaps in the tabular summaries and to provide coherent descriptive statements 
for each model. Additional information is also provided about the institutional 
affiliation of the investigators and the funding agency for each modeling project. 
Attention is drawn to any special qualities or potentially undesirable aspects of the 
models. The concluding paragraph in each synopsis is essentially a narrative 
assessment of how well the modeling project achieved its stated or implied goals. 


Reese River Ecological Project “An Empirical Test for Steward’s Model of Great 
Basin Settlement Patterns.” David Hurst Thomas. American Antiquity 
38:155-176. 1973 


The Reese River Ecological Project was conducted by the American Museum 
of Natural History and funded, in part, by the National Science Foundation and the 
University of California (Thomas 1973). It is one of the few research projects, as 
opposed to cultural resource management projects, selected for summarization. 


TABLE A.1. (Continued) 


Summary of general characteristics of the selected predictive model projects. 


Summary of characteristics of data base and predictions from selected predictive model projects. 


TABLE A.5. 


Summary of general characteristics of field investigations conducted in conjunction with selected predictive model projects. 





Reese River Ecological Project (Thomas 1973) 
    Use of new field data: To provide archaeological verification of Steward's model 
    Size of survey area: 730 ha 
    Survey procedure: Survey of grid units (235 ha) at a rate of ca. 62 ha per person 
        day; intensive surface collections; no systematic subsurface examinations 
    Site number, type, density: NRO*, but 97% of all materials assigned to Medithermal 
        period, between 2500 BP and historical times 
    Discussion of constraints (scale 1-5): 1 


Elk Ridge Project (DeBloois 1975) 
    Use of new field data: To provide data base for a test of random sampling in 
        archaeology 
    Size of survey area: [?]5 ha 
    Survey procedure: Survey of various kinds of landforms or parallel transects; 
        survey rate [illegible] NRO; total collection of ceramics, lithics, and organic 
        remains at small sites, and within transects at large sites; no systematic 
        subsurface examination 
    Site number, type, density: Overall site density is 1 site every 7 ha; ca. 77% of 
        sites had ceramics and were assigned to Basketmaker (II-III) or Pueblo (I-IV) 
        periods 
    Discussion of constraints (scale 1-5): 1 


Lake Koocanusa Project (Jermann and Aaberg 1976) 
    Use of new field data: To provide data base for predicting site densities in 
        unsurveyed areas 
    Size of survey area: 338 ha 
    Survey procedure: Surveyed 800 m wide tracts in parallel transects 30 m wide at a 
        rate of ca. 6.8 ha per person day; surface collection (grab) of some temporally 
        diagnostic artifacts; no systematic subsurface examination 
    Site number, type, density: Overall prehistoric site density of 1 site every 
        16 ha; prehistoric, historical, and stone feature sites were recorded 
    Discussion of constraints (scale 1-5): 4 


CO2 Project (Woodward-Clyde Consultants 1978; James et al. 1983) 
    Use of new field data: Existing data used to build model, but 140 cells were 
        “visited” 
    Size of survey area: 208 ha 
    Survey procedure: NRO; all 140 randomly selected cells (ca. 1.5 ha each) were 
        “visited in the field” as a means of verification of the model 
    Site number, type, density: NRO, but the field visit results were such that “the 
        standard error of predicted-to-observed value was identical to the standard 
        error of the model” (James et al. 1983:23) 
    Discussion of constraints (scale 1-5): 2 


Continental Shelf Project (Barber and Roberts 1979) 
    Use of new field data: Existing data used to build models 
    Size of survey area: NRO 
    Survey procedure: NRO; various surveys with many different methods 
    Site number, type, density: NRO, but site density and types are highly variable 
        within large areas 
    Discussion of constraints (scale 1-5): 5 


Fort Benning 4000-Acre Survey (Kohler et al. 1980) 
    Use of new field data: To provide data base for model construction 
    Size of survey area: 1619 ha 
    Survey procedure: Survey of 30 m wide transects at an overall rate of ca. 8 ha per 
        person day, including systematic subsurface examination and a total collection 
        of all materials on the surface 
    Site number, type, density: Depending upon site likelihood stratum, density ranged 
        from 1 site every 10 ha to 1 every 119[?] ha; sites and/or isolated finds 
        represented early Archaic through historical periods 
    Discussion of constraints (scale 1-5): 2 


Tombigbee Early Man Project (Muto and Gunn 1980) 
    Use of new field data: To test empirical model and develop initial predictive model 
    Size of survey area: NRO, but 5[?] localities tested, most between 15 and 9 m in 
        diameter 
    Survey procedure: Surface survey; contour map drawn with 20 m grid; random 
        selection of 15-25% of grid unit intersections for subsurface testing with 
        auger core; judgmental locations and off-site locations also tested 
    Site number, type, density: NRO, but 62% of tested localities yielded cultural 
        materials, including projectile points representative of all culture time 
        periods 
    Discussion of constraints (scale 1-5): 5 


[Rows for the following projects are only partly legible in the source; the 
legible fragments are listed by column, in order of appearance.] 


Projects and use of new field data: 
    [project name illegible]: Limited use of existing site file data to build model 
    Bisti-Star Lake Project (Kemrer 1982): To test and refine model 
    Ozark-St. Francis National Forest Project (Sabo et al. 1982): Existing data used 
        to construct or test models 
    Passaic River Basin Project (Hasenstab 1983): To test the model derived using 
        [remainder illegible] 


Size of survey area: 
    4437 ha 
    72.49 ha or about 15% of project area 
    None related to model 
    479 ha 


Survey procedure: 
    Survey methods varied considerably in the [illegible] survey blocks [remainder 
        illegible] 
    Survey of 15 m wide [illegible]; 45-4[?] ha per person day; [illegible] sketched; 
        grab sample of diagnostic artifacts 
    NRO; various procedures for many different [surveys] 
    NRO, but results of archival searches and some reconnaissance-level survey work 
        were employed to build model 
    Survey of 15-30 m wide transects to achieve “an intensive 100% coverage”; survey 
        rate NRO; point-provenience collection of unique artifacts or materials 
        requiring further laboratory identification; no systematic subsurface 
        examination 
    Various procedures for many different surveys and site information reported by 
        amateurs 


Site number, type, density: 
    [illegible] isolated finds; overall site density was 1 site [remainder illegible] 
    NRO for individual surveys, but of the 316 components with reasonably reliable 
        [remainder illegible] 


*NRO [definition partly illegible; ends “obtained from referenced source”] 


TABLE A.5. (Continued) 





Grand Junction Resource Area Project (Kvamme 1983) 
    Use of new field data: To build predictive model 
    Size of survey area: 667[?] ha 
    Survey procedure: Survey in parallel transects 20-25 m wide at a rate of 
        approximately 16-22 ha per person day; no collections; [partly illegible]; 
        some sites sketched; no systematic subsurface examination 
    Site number, type, density: [largely illegible in source] 


Kaibab (K) and Cuba (C) Study Area Projects (Plog 1983a, 1983b) 
    Use of new field data: K: Existing data from one survey used to build model; data 
        from other surveys used to test model. C: Existing data used to assess 
        potential of predictive modeling 
    Size of survey area: K: NRO, perhaps 5[?] ha; C: 3477 ha 
    Survey procedure: K: NRO, probably different methods for different surveys. 
        C: NRO, perhaps different methods for different surveys 
    Site number, type, density: K: Surveys recorded 16 sites with densities ranging 
        from total absence to 1 every 43 ha depending on drainage/vegetation zone. 
        C: Surveys recorded 142 sites with overall density of 1 site every 26 ha and a 
        mesa top density of 1 site every 57 ha, all representative of Gallina phase 


Fort Benning 2200-Acre Survey Project (Thomas et al. 1983) 
    Use of new field data: To test and refine existing model and to develop a 
        discriminant analysis model 
    Size of survey area: 891 ha 
    Survey procedure: Survey in parallel transects 30 m wide; systematic clearing of 
        forest litter and subsurface examination; collections from shovel tests; 
        systematic surface collection; completion of forms for nonsite areas; survey 
        rate NRO 
    Site number, type, density: Total of 37 sites and 12 isolated finds, including 20 
        prehistoric sites, 15 historical sites, and 2 prehistoric/historical sites; 
        overall site density is 1 site every 26 ha; prehistoric site density is 1 site 
        every 40.5 ha 


Cisco Desert Project (Bradley et al. 1984) 
    Use of new field data: To build a predictive model 
    Size of survey area: 1619 ha 
    Survey procedure: Survey in parallel transects 15 m wide; collected samples (grab) 
        of diagnostic artifacts and “obsidian and chert source material”; no 
        systematic subsurface examination; survey rate NRO 
    Site number, type, density: Total of 111 prehistoric sites; diagnostic artifacts 
        representative of all culture time periods; 44.1% ceramic and/or lithic 
        scatters, 2.7% campsites, 5.4% rockshelters, 19 quarry sites; overall 
        prehistoric site density is 1 site every 14.6 ha 


Route 13 Relief Corridor Project (Custer et al. 1984) 
    Use of new field data: Existing data used to construct model and compare results 
        with those of other models 
    Size of survey area: None related to model 
    Survey procedure: Various surveys with various methods resulting in various 
        qualities of data 
    Site number, type, density: Of 190 known prehistoric sites: [illegible] Archaic, 
        [illegible] Woodland I, 20.7% Woodland II, [illegible] unknown; for generalized 
        functional types: 7.9% macroband sites, 26% microband sites, 16% procurement 
        sites, and 90.7 unknown; overall known prehistoric site density is 1 site 
        every 485 ha 


Montane Hunter-Gatherer Project (Hackenberger 1984) 
    Use of new field data: Existing data used to estimate the potential of 
        archaeological data to test the economic decision model 
    Size of survey area: None related to model 
    Survey procedure: Various surveys with various methods 
    Site number, type, density: NRO; mention upland and river valley canyon sites as 
        well as rockshelter and cave sites and their potential to yield data for 
        testing models 


Tar Sands Project (Tipps 1984) 
    Use of new field data: To develop, test, and refine models 
    Size of survey area: 70 ha 
    Survey procedure: Survey in parallel transects 15 m wide; limited systematic 
        subsurface examinations (“probed”); site forms, sketch maps, tools drawn, 
        collection/plotting of potentially diagnostic artifacts; survey rate of about 
        13 ha per person day 
    Site number, type, density: Total of 167 components (155 sites): 5.4% 
        Euroamerican, 1.8% Numic, 10.2% Anasazi, 3.4% Fremont, 17.4% Archaic, and 
        61.7% unknown; of the 158 prehistoric components: 15.2% limited activity sites, 
        05% field camps, 31.0% base camps, 6.3% habitation camps; overall site 
        density is 1 site every 4 ha 


Central Oregon Project (Wildesen 1984) 
    Use of new field data: Existing data used to develop model 
    Size of survey area: None related to model 
    Survey procedure: NRO, but probably numerous surveys using different methods 
    Site number, type, density: Total of 3646 sites in study area; culture time period 
        information and functional typological data NRO; site density for 259 ha 
        (640 acre) areas ranging from 1 site every 2599 ha to 1 site every [?] ha 


Basin and range topography, and sagebrush flats and pinon-juniper woodlands 
are characteristic of the 77,730 ha project study area in the upper Reese River Valley 
of central Nevada. Steward’s (1938) ethnographic model for the Reese River 
Shoshone subsistence patterns was tested using archaeological data. Ethnographically 
derived seasonality, resource use, activity, assemblage, and settlement 
information was quantified, and the resulting data were used in a computer-based 
simulation model. Initially the model was used to predict the nature of food 
procurement and maintenance activities in different environmental zones. 
Ultimately, the model also predicted artifact and feature distributions and densities in 
four “lifezones.” 

The simulation-generated predictions were tested using new survey data. The 
project area was stratified on the basis of vegetation communities or 
microenvironmental zones that were exploited differentially by the Shoshone. A 500 by 500 m grid 
was superimposed on the study area, and a 10 percent sample was selected from each 
stratum. The resulting 140 grid units (25 ha each) were surveyed, and the locations 
of individual artifacts and features were plotted on maps; these artifacts and features 
(rather than clusters defined as “sites”) served as the unit of information. Artifact 
and feature distributions and densities derived from the survey data (see Thomas 
1975) were compared with distributions and densities predicted by the simulation 
model. Finally, statistical significance tests (e.g., chi-square and Mann-Whitney U) 
were used to examine the relationship between expected and observed values. 
Steward’s model was supported by the survey data in that 75 percent of the 
predicted frequencies were verified by the archaeological remains. 
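

The sampling and testing steps described above can be sketched in a few lines of 
Python. This is only an illustration under stated assumptions: the stratum names, 
the number of grid units in each stratum, and the observed and expected counts are 
hypothetical (the unit counts were chosen so that a 10 percent draw yields 140 
units), and none of them comes from Thomas (1973). The chi-square function computes 
the ordinary Pearson statistic and is not a reconstruction of the project's own 
calculations. 

```python
import random


def stratified_sample(units_by_stratum, fraction, seed=0):
    """Draw the same fraction of grid units from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for stratum, units in units_by_stratum.items():
        k = max(1, round(len(units) * fraction))
        sample[stratum] = rng.sample(units, k)
    return sample


def chi_square(observed, expected):
    """Pearson chi-square statistic for paired observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical strata: the names and unit counts are invented for illustration.
units_by_stratum = {
    "sagebrush_flat": list(range(0, 600)),
    "pinon_juniper": list(range(600, 1000)),
    "riverine": list(range(1000, 1400)),
}

sample = stratified_sample(units_by_stratum, fraction=0.10)
n_units = sum(len(units) for units in sample.values())

# A 500 m by 500 m grid unit covers 250,000 square meters, or 25 ha.
cell_area_ha = 500 * 500 / 10_000

# Hypothetical artifact counts per zone: simulated (expected) versus surveyed (observed).
expected = [25, 55, 20]
observed = [30, 50, 20]
statistic = chi_square(observed, expected)

print(n_units, "units sampled, covering", n_units * cell_area_ha, "ha")
print("chi-square statistic:", round(statistic, 3))
```

The statistic would then be compared with a critical value for the appropriate 
degrees of freedom; that lookup is omitted here. 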


Given the stated objectives, this project was a successful predictive modeling 
effort, and the results contributed to existing knowledge because the nature and 
distributions of cultural resources were defined and partially explained. The project 
also employed an innovative survey strategy, the nonsite approach, wherein the 
distributions of artifacts and features across the landscape rather than concentrations 
of materials (sites) are monitored. That approach circumvents some of the 
adverse effects that can result from using observed densities of artifacts to 
distinguish arbitrarily between isolated finds and sites in an attempt to understand past 
human behavior. The model is subject to criticism, however, for its heavy reliance 
on the ethnographic record. That approach can only be justified insofar as it can be 
demonstrated that relevant aspects of the ethnographically documented land-use 
systems are consistent with human behavior in the area during the last 4500 years. 


Elk Ridge Project The Elk Ridge Archaeological Project: A Test of Random Sampling in 
Archaeological Surveying. Evan I. DeBloois. Cultural Resources Report No. 2. 
USDA Forest Service, Intermountain Region. 1975 


The Elk Ridge Project was sponsored, in part, by the Forest Service's Intermountain 
Regional Office as a feasibility study for determining the validity and 
reliability of random sampling designs in archaeological survey. It was carried out 
initially by Forest Service personnel and subsequently by individuals representing 
Brigham Young University. Its objective was to determine whether a predictive 
sampling strategy could be designed and implemented as an interim step in the 
total inventory of a project area. The author was interested in investigating “the 
reliability of sampling in predicting attributes of the larger population,” and 
specifically in addressing the question of “how many sites can be expected in 
such-and-such an area?” (DeBloois 1975:4, 126). This project clearly had 
management-oriented objectives, but it also had research objectives as a study of 
the utility of sampling in cultural resource management. An early version of the 
study was a dissertation project at the University of Washington. The information 
summarized here is from DeBloois (1975). 


The study focused on a 133,603 ha area in southeastern Utah comprising 
ponderosa pine, pinon-juniper, oak-serviceberry, and cottonwood vegetation 
zones. Some 640 sites were recorded during the survey of a 44% ha sample of the 
project area. Almost all of the sites were either habitation or special-use sites 
assigned to the Basketmaker/Pueblo sequence. Environmental (e.g., vegetation, 
soil, and landform types) and cultural (e.g., site size, type, and cultural affiliation) 
data for the study area were coded using a Topcart digitizer. Various random 
samples of different proportions and quadrat sizes were drawn from the area 
surveyed and used to calculate the total number of sites. Resulting estimates were 
compared with the actual data base and assessed using the chi-square test. 
Assessments were made in an attempt to measure the accuracy of different sampling 
techniques and sizes. Simple random sampling was found to be a reliable predictor 
of total population but not necessarily of the distribution of certain site 
characteristics. When survey of an “unknown area” was simulated and a random sampling 
scheme was applied, units between 600 and 800 m sq (ca. 36-64 quadrats) were found 
to be most effective. Because of the “dangers of improper stratification of an 
unknown population” it was concluded that simple random sampling might be a 
“better choice” for initial surveys (DeBloois 1975:126). 
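

The kind of experiment described above, in which samples are drawn from a fully 
surveyed area and the resulting estimates are compared with the known total, can 
be sketched as follows. The quadrat site counts, the sampling fractions, and the 
number of trials are all hypothetical; this is a minimal illustration of simple 
random sampling, not the procedure or data used by DeBloois (1975). 

```python
import random


def estimate_total_sites(site_counts, fraction, rng):
    """Estimate the total number of sites from a simple random sample of quadrats."""
    k = max(1, round(len(site_counts) * fraction))
    sampled = rng.sample(site_counts, k)
    # Mean sites per sampled quadrat, scaled up to the full set of quadrats.
    return sum(sampled) / k * len(site_counts)


# Hypothetical site counts for 120 fully surveyed quadrats (invented data).
site_counts = [0, 0, 1, 2, 3, 8] * 20
actual_total = sum(site_counts)

rng = random.Random(42)
for fraction in (0.05, 0.10, 0.25):
    estimates = [estimate_total_sites(site_counts, fraction, rng) for _ in range(2000)]
    mean_estimate = sum(estimates) / len(estimates)
    largest_miss = max(abs(e - actual_total) for e in estimates)
    print(f"{fraction:.0%} sample: mean estimate {mean_estimate:.1f}, "
          f"largest miss {largest_miss:.0f}, actual total {actual_total}")
```

Averaged over many draws the estimates center on the true total, while any single 
small sample can miss it widely, which is the trade-off such an experiment is meant 
to expose. 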


The Elk Ridge Project was one of the earliest attempts to apply the concepts of 
sampling and predictive locational modeling to federally mandated cultural 
resource management. Given that this project served as a prototype, the relatively 
simple (largely univariate) statistical approaches used cannot be expected to 
compare favorably to more recent modeling efforts, with their rigorous use of complex 
multivariate statistics. As is the case with many predictive models generated using 
data bases where known sites represent only the last few thousand years of 
prehistory, one is left wondering about the locations of sites representing the 
preceding 10,000 years of prehistory in the Elk Ridge area. 


Lake Koocanusa Project Archaeological Reconnaissance in the Libby Reservoir-Lake 
Koocanusa Area, Northwestern Montana. Jerry V. Jermann and Stephen Aaberg. 
Department of Sociology, Montana State University. 1976 


The Seattle District Corps of Engineers sponsored the Lake Koocanusa 
reconnaissance project, which was carried out by personnel representing Montana State 
University. It was funded because Corps personnel discovered a number of 
previously unrecorded sites in the denuded drawdown zone of Lake Koocanusa in 
northwestern Montana. The primary objective of the project was to obtain 
estimates for the total number, nature, and distribution of sites that might be present 
in the reservoir drawdown zone (Jermann and Aaberg 1976). 


For sampling purposes the 4806 ha, 80 km long study area in the Kootenai River 
Valley was subdivided on the basis of topography. A series of survey tracts 800 m in 
width, and of various lengths, were selected randomly from each topographic 
stratum. The tracts represented between 3.6 and 8.2 percent of the eight 
subdivisions and totaled about 339 ha or approximately 6 percent of the project area. 
Twenty-one prehistoric sites, identified as spanning the early Middle Prehistoric 
(i.e., Archaic) to the Late Prehistoric periods, were documented. Euroamerican 
sites were also recorded. Site density figures were calculated for the surveyed 
portions of the various topographic subdivisions, and these figures were multiplied 
by the total area in each stratum to estimate the total number of sites in the project 
area. It was estimated that as many as 400 sites might be present in the drawdown 
zone. 
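

The extrapolation just described (site density in the surveyed part of each stratum 
multiplied by that stratum's total area, then summed) can be written out as a short 
Python sketch. The stratum names, areas, and site counts below are hypothetical and 
are not taken from Jermann and Aaberg (1976); the function name is likewise only an 
illustrative choice. 

```python
def project_site_total(strata):
    """Sum, over strata, the surveyed site density times the stratum's total area."""
    total = 0.0
    for stratum in strata:
        # Sites per hectare observed in the surveyed tracts of this stratum,
        # multiplied by the full area of the stratum.
        total += stratum["sites_found"] * stratum["total_ha"] / stratum["surveyed_ha"]
    return total


# Hypothetical strata: names, areas, and counts are invented for illustration.
strata = [
    {"name": "valley floor", "total_ha": 1500, "surveyed_ha": 100, "sites_found": 6},
    {"name": "low terrace", "total_ha": 1000, "surveyed_ha": 60, "sites_found": 3},
]

projected = project_site_total(strata)
print("projected number of sites:", round(projected))
```

Because each stratum contributes in proportion to its own observed density, a 
stratum with no sites in its surveyed tracts contributes nothing to the projection, 
however large it is. 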


This project is an early example of what might be termed the “direct extrapolation 
of site density” approach to predictive modeling, or what Kohler and Parker 
(1986) call projection. Although very simple in its approach, this application can be 
considered successful because with this projection of high site densities the Corps 
was able to justify funding intensive surveys. In an area where the vast majority of 
known sites represent only the last few thousand years of occupation and are 
situated in valley bottoms, the detection and prediction of older sites located well 
above the valley bottom is recognizably a contribution of information important to 
our understanding of local prehistory. 


Wasson Field-Denver Unit CO2 Project “Predicting Site Significance: A Management 
Application of High-Resolution Modeling.” S. E. James, R. Knudson, 
A. Kane, and D. Breternitz. Paper presented at the 48th Annual Meeting of the 
Society for American Archaeology. 1983; “Appendix E,” in Well Field Development 
Plan for the Wasson Field-Denver Unit CO2 Project Environmental Impact Report. 
Woodward-Clyde Consultants. 1978 


The Wasson Field-Denver Unit CO2 predictive modeling project was funded 
by a private oil company as part of its effort to develop an environmental impact 
statement (EIS) for a carbon dioxide well-field project in southwestern Colorado. 
The cultural resources portion of the EIS was necessary in part because the Bureau 
of Land Management required a right-of-way permit. Personnel representing 
Woodward-Clyde Consultants were responsible for preparing a planning study that 
would improve well-field layout by minimizing impacts to significant archaeological 
sites. Information summarized here is taken from two draft documents (James et al. 
1983; Woodward-Clyde Consultants 1978). 


The 263,158 ha project area comprises plateaus and canyons, agricultural land, 
rangeland, and forests. Environmental and cultural data were entered, compiled, 
analyzed, and displayed using a geographic information system. Map-based 
information for land use and soil association, prehistoric farming areas, topography, 
roads, archaeological sites (including data on period of occupation, size, type, and 
condition), biological communities, and geologic materials was coded and digitized 
for 175,000 cells, each representing ca. 1.5 ha. Site significance was identified as the 
dependent variable and defined in part on the basis of age, type, size, and number of 
components for hundreds of known Basketmaker, Puebloan Anasazi, and 
post-Anasazi sites. A fundamental aspect of this definition of significance was described as 
the “subjective attitudes of professional archaeologists” (Woodward-Clyde 
Consultants 1978:E-4). The investigators developed a seven-point scale that they 
believed conformed to “prevailing opinions of the professional archaeological 
community” (James et al. 1983:17). Ultimately, three independent environmental 
variables (soil, drainage rank, and slope) were used in a step-wise multiple 
regression, with the computed site significance values serving as the dependent variable. 
Sets of surveyed cells without sites were also included in the analysis. The analyses 
yielded significance values for each cell, and the scaled values were then color coded 
and plotted on 1:24,000 scale maps. A total of 140 randomly selected cells were field 
inspected as a means of verifying the model. The model was supported to the extent 
that the “standard error of predicted-to-observed value was identical to the 
standard error of the model” (James et al. 1983:23). 


This project serves as an example of a management-oriented model designed 
to minimize uncertainties and delays in the permitting process. It is innovative in its 
attempt to define significance by relying on the expertise of individuals 
knowledgeable about the most abundant kinds of sites in the project area, namely those 
considered to have been occupied by Anasazi groups between AD 450 and 1250. 
What might be of concern, at least to archaeologists who specialize in 
hunter-gatherer studies, is that Archaic sites and Basketmaker II sites were assigned the 
same code for period of occupation. Furthermore, there is no other provision for 
isolating site types that represent some of the limited “evidence of seasonal and 
sporadic presence of peoples from the Paleo-Indian and Archaic periods (10,000 
BC-AD 450)” (Woodward-Clyde Consultants 1978:E-8). 


Continental Shelf Project Archaeology and Paleontology. Summary and Analysis of 
Cultural Resource Information on the Continental Shelf from the Bay of Fundy 
to Cape Hatteras, Final Report, Vol. II. Russell Barber and Michael E. Roberts. 
Institute for Conservation Archaeology, Peabody Museum, Harvard University. 
1979 


Personnel representing the Institute for Conservation Archaeology at the 
Peabody Museum conducted the Continental Shelf Project for the Bureau of Land 
Management. The project was designed primarily to provide the BLM with 
information about known or expected prehistoric sites and historically important 
shipwrecks and to generate predictions about where specific types of sites will be 
found. Information presented here focuses on the prehistoric sites portion of the 
study by Barber and Roberts (1979). 


Continental shelf, coastal, and nearby low-elevation terrestrial areas between 
Maine and North Carolina constitute this project's 32,388,664 ha study area. 
Inductive site-locational models were generated from the site records for dryland 
areas similar to areas on the continental shelf. Deductive models were generated for 
the intensity of settlement in a given zone by relying on knowledge and 
assumptions about human foraging behavior and relevant paleoenvironmental conditions. 
General and specific predictions derived from both kinds of models were combined 
to form a final model. This model was based on the results of a generalized 
assessment of goodness of fit between predictions. The final model presented 
generalized predictions (i.e., high, medium, medium-low, very low likelihood) for 
site size, site density, and site type for each of six 3000-year periods and four 
environmental zones. The time periods correlate roughly with different sea levels 
and the resulting changes in the positions of the coastline. These changes 
differentially affected the distribution and nature of the estuarine, inland valley, upland, 
and coastal environmental zones in each of the three identified subareas: Maine, 
southern New England, and Mid-Atlantic. The model's end product is presented 
on a series of 1:250,000 scale maps that illustrate 122 archaeology zones, each of which 
is characterized by time period for predicted site types as well as generalized site 
frequencies and site sizes. 


The authors’ claim that the project represents an advance in the state of the art
of predictive modeling for the nature and distribution of prehistoric sites is
justifiable, although the spatial resolution of prediction is low. By combining existing
site-file data for some 6600 sites with the theory of optimal foraging strategy and
including information derived from environmental reconstruction, paleoclimatology,
and other disciplines, the investigators were able to predict and partially
explain the distribution of cultural resources. They have provided planners with
information on the predicted nature and distribution of sites in a vast area within
subdivisions as small as 50 km². At the same time, the approach is reasonably
compatible with contemporary theories, and it affords the opportunity to discover
previously undocumented kinds of cultural resources (for additional discussion, see
Chapter 2).


Fort Benning 4000-Acre Project An Archaeological Survey of Selected Areas of the Fort
Benning Military Reservation, Alabama and Georgia. T. A. Kohler, T. P. DesJeans, C.
Feiss, and D. E. Thompson. Remote Sensing Analysts. 1980


Remote Sensing Analysts, a private firm based in Tucker, Georgia, conducted 
the Fort Benning project for the U.S. Army. The scope of work and contract were 
developed and administered by the Heritage, Conservation and Recreation Service, 
Interagency Archeological Services, Atlanta. That agency was responsible for 
selecting the survey tract and specifying the development of a predictive model to
serve as an interim management tool. The source of information for the site
predictive model summarized here is Kohler et al. (1980). 


Fort Benning, located in the Fall Line Hills portion of east-central Alabama and
west-central Georgia, encompasses coniferous and mixed forests, scrub oak and
brush, and swamp vegetation zones. A judgmentally selected 1619 ha area was
surveyed, and 31 sites were identified. Of these sites, 10 had historical nonaboriginal
components and six had historical aboriginal components; three had Mississippian
components; three had Late and Middle Woodland components; eight had Early
Woodland and late Archaic components; and three had middle and early Archaic
components. Analysis of variance, goodness of fit, and t-tests were used to identify
soil type, slope, and distance to water as variables that were correlated with site
location. Several soil types, slopes of less than 10 percent, and areas between 75 and
225 m from water were identified as favorable site locations throughout the project
area. These locations were plotted on 1:25,000 scale maps. The locations of predicted
site-likelihood strata (maximum, intermediate, and least likely to contain
sites) were defined on the basis of the number of intersecting favorable states and
were plotted on other maps. For example, areas with Cahaba sandy loam soil on
slopes of less than 10 percent and between 75 and 225 m from water were identified
as part of the maximum likelihood stratum, whereas areas with similar soils but on
steeper slopes and lying more than 225 m from a creek were defined as part of the
zone least likely to contain sites. The model also included site-density estimates for
the unsurveyed strata, and it included probability estimates for encountering a site
within any given randomly selected area.
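

The stratum assignment just described can be sketched in a few lines of Python.
This is only an illustration: the function name and the cut-offs (three favorable
states for the maximum stratum, two for intermediate, one or none for least likely)
are assumptions chosen to agree with the Cahaba sandy loam example, not rules
taken from Kohler et al. (1980).

```python
def likelihood_stratum(soil_favorable, slope_percent, distance_to_water_m):
    """Assign a site-likelihood stratum by counting favorable variable states."""
    favorable_states = 0
    if soil_favorable:                      # e.g., Cahaba sandy loam
        favorable_states += 1
    if slope_percent < 10:                  # slopes of less than 10 percent
        favorable_states += 1
    if 75 <= distance_to_water_m <= 225:    # between 75 and 225 m from water
        favorable_states += 1

    # Assumed cut-offs (not stated in the report summary).
    if favorable_states == 3:
        return "maximum"
    if favorable_states == 2:
        return "intermediate"
    return "least likely"


# Favorable soil, gentle slope, 100 m from water.
print(likelihood_stratum(True, 5, 100))
# Similar soil, but steeper slope and more than 225 m from a creek.
print(likelihood_stratum(True, 15, 300))
```

Counting intersecting states keeps the rule easy to apply from maps, at the cost of
treating all three variables as equally important.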


The project can be considered successful in that a readily testable model was
generated to predict the probability of encountering a site anywhere in the project
area. It is noteworthy that this project represents an early and comparatively
rigorous attempt to use statistical approaches along with new field data generated as
a result of a systematic surface and subsurface survey. As in many of the inductive or
correlative models, most of the site-type information, which can be informative
about the potential of a site to yield important information, is lost when the various
kinds of prehistoric sites are merged to generate a site/siteless dichotomy for
predictive purposes. Although the concept of site significance is not directly dealt
with in the model, there is an implication that maximum site likelihood zones have
the highest probability of containing significant cultural resources, especially larger
residential sites. Other kinds of sites that may have potential to yield important
information are likely to be encountered in the zones that are least likely to contain
sites, and by implication these sites are not as likely to be discovered. For example,
some types of vegetal procurement sites might be expected to occur on stony soils
far from water.




Tombigbee Early Man Project A Study of Late Quaternary Environments and Early
Man Along the Tombigbee River, Alabama and Mississippi, Phase I. Guy R. Muto and
Joel Gunn. Benham-Blair and Affiliates. 1980


The environmental division of Benham-Blair and Affiliates, an architecture
and engineering firm, designed and implemented the Tombigbee Early Man
Project. It was funded and partially administered by the Corps of Engineers, but the
scope of work and project review were primarily the responsibility of the Heritage,
Conservation and Recreation Service, Washington, D.C., and Interagency Archeological
Services, Atlanta. The draft report (Muto and Gunn 1980) was the source of
information summarized here.


The 77,733 ha project area lies within the Tombigbee River Valley of eastern
Mississippi and western Alabama. Forests and agricultural crops cover the alluvial 
terraces, and swamp vegetation occupies the extensive flood basin. The project’s 
major goal was to develop a model that would predict the locations of Paleoindian 
and early Archaic sites. Since most of the Early Man sites were expected to be 
deeply buried in late Pleistocene and/or early Holocene deposits, an important
aspect of the project was the prediction of locations of landforms old enough to 
contain early sites. Toward that end, a generalized “empirical” site-location model 
was developed based on the known distribution and nature of Early Man sites as 
well as on inferred Pleistocene and early Holocene environmental conditions. Using 
the resulting locational criteria (e.g., inside of river bend, near confluence, near
wetlands) as predictive variables, the researchers visually scanned topographic 
maps for likely site locations; 620 such locations, termed Quaternary projections, were 
identified. 


A second inductive model was developed using a computer-based “prospecting
technique” known as kriging (Muto and Gunn 1980:4-18; see also Chapter 3).
The kriging model predicted the location of landforms or areas likely to contain 
early sites. Toward that end, during the kriging operation the computer searches 
its data banks for grid units encompassing landforms with environmental character- 
istics like the landforms known to contain sites. The program provides probability 
estimates for the likelihood that a given grid unit may contain the appropriate 
landform. Those grid units predicted to contain sites on the basis of the kriging 
model were termed machine projections. 


Both models were tested by on-the-ground examinations of a sample of the 
Quaternary projection locations and machine projection units. Techniques 
designed to detect buried sites in lowland and swampy environments were used to 
determine site presence and absence at the sampled locations. These techniques 
included the use of soil augers capable of penetrating and recovering several meters 
of clay-rich sediments, which were examined for the presence of artifacts and 
chemically tested to detect paleosols or those deposits with the potential of 
containing cultural materials. A total of 56 Quaternary and machine projections 
were selected and tested for the presence of cultural materials. Of those, 34 locations 
were selected using a proportional stratified random sampling scheme. Strata were 
defined as combinations of locational criteria. For example, one stratum included 
only locations near stream confluences and wetlands while another stratum 
included locations with the same criteria plus being on the inside of a river bend. 
The other 22 locations were selected on a judgmental basis for on-the-ground 
testing because they exhibited unique environmental characteristics or because 
they filled spatial gaps in the random sample. 
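

A proportional stratified random sample of the kind described above can be
sketched as follows. The stratum names and the number of locations in each stratum
are invented for illustration (the report summary does not give them); only the
sample size of 34 comes from the text, and the frame of 620 locations is assumed
for the example. Fractional allocations are resolved with a largest-remainder
rule, which is one reasonable choice among several.

```python
import random


def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    quotas = {name: total_sample * size / population
              for name, size in stratum_sizes.items()}
    allocation = {name: int(quota) for name, quota in quotas.items()}
    leftover = total_sample - sum(allocation.values())
    # Give the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda n: quotas[n] - allocation[n],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


def stratified_sample(strata, total_sample, seed=0):
    """Draw a simple random sample, without replacement, within each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, total_sample)
    return {name: rng.sample(units, allocation[name])
            for name, units in strata.items()}


# Hypothetical strata defined by combinations of locational criteria.
hypothetical_counts = {"confluence+wetland": 310,
                       "confluence+wetland+bend": 186,
                       "other": 124}
strata = {name: [f"{name}-{i}" for i in range(count)]
          for name, count in hypothetical_counts.items()}
sample = stratified_sample(strata, total_sample=34)
print({name: len(units) for name, units in sample.items()})
```

Judgmentally chosen locations, such as the other 22 tested here, would be handled
separately because they cannot be used to estimate stratum-wide probabilities.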


The overall approach achieved some success in that slightly more than half of 
the randomly selected Quaternary projections yielded cultural materials. This 
success rate is actually quite high given that few of the sites would have been 
detected by examination of surface or near-surface deposits. An important contribution
of this project to predictive modeling is its extensive use of paleoenvironmental
data and its attention to depositional processes that alter the shape of the
landscape and bury archaeological sites. Although more than 600 locations were 
identified and almost 10 percent were tested, the resulting data were not employed 
to calculate probability estimates for detecting a site at any given location or within 
any stratum. The greatest problems encountered in summarizing this project from 
the information presented in the draft report were that the detailed discussions of 
some of the project methods were difficult to understand, and the relationships
between these methods and the results of the project were not always clear. In 
addition, the lack of a detailed discussion about the derivation and use of the kriging 
model was disappointing. These organizational problems may be a result of the 
draft status of the project report and could be resolved as part of the editorial 
process. 


NPR-Alaska Project “Remote Sensing in the NPR-A Cultural Resources Assessment.”
James I. Ebert and Galen N. Brown. In Anthropological and Historic
Preservation Cooperative Park Studies Unit Occasional Paper No. 25, pp. 349-419.
University of Alaska. 1981


The National Petroleum Reserve-Alaska Project was sponsored by the 
National Park Service, Washington, D.C., and implemented largely by personnel 
representing the Anthropology and Historic Preservation Cooperative Park Studies 
Unit of the University of Alaska. Remote sensing components of the project were 
carried out by personnel from the National Park Service’s Remote Sensing Division, 
Southwest Cultural Resources Center, Albuquerque. The objective was to use a 
remote sensing approach to correlate environmental settings with known site 
locations in an effort to increase the accuracy and cost efficiency of the cultural 
resource assessment of the 9 million ha project area. The report on the remote 
sensing aspects (Ebert and Brown 1981) provided the information summarized here 
(see also Chapter 9). 


Moist tundra, wet tundra, alpine tundra, high brush, and waterways constitute
the basic ecosystems in north-central Alaska, where this project was located.
The area includes portions of the Brooks Range, Arctic Foothills, and Arctic Coastal 
Plain physiographic provinces. 


Landsat and high-altitude color infrared imagery data were used to define six 
ecologic cover types and six transitional types. These 12 strata and all previously 
recorded archaeological sites were plotted on 1:250,000 scale maps. Area measure- 
ments were made for each stratum, and the amount of land that had been surveyed 
within the various strata was calculated. Next, cultural, landform, and ecologic 
cover-type data were recorded and correlated as a means of characterizing the 
occurrence or nonoccurrence of site-specific cultural and landform data in each 
stratum. For predictive purposes, the observed site density in the surveyed por- 
tions of each stratum could be multiplied by the area of any unsurveyed parcel (in 
the same stratum) to determine a site-frequency estimate for that part of the 
project area. Using similar extrapolation techniques, the project personnel generated
sample data to estimate the relative frequencies of site types and the expected
content in unsurveyed areas.
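

The density extrapolation described above amounts to a single multiplication. The
sketch below shows it in Python; the function name and the example figures (12
sites in 4,000 surveyed hectares, applied to a 25,000 ha parcel) are hypothetical
and are not taken from Ebert and Brown (1981).

```python
def expected_site_count(sites_found, surveyed_ha, parcel_ha):
    """Estimate sites in an unsurveyed parcel of the same stratum.

    Observed density (sites_found / surveyed_ha) multiplied by parcel area,
    written as one expression to limit floating-point rounding.
    """
    if surveyed_ha <= 0:
        raise ValueError("surveyed area must be positive")
    return sites_found * parcel_ha / surveyed_ha


# Hypothetical stratum: 12 recorded sites in 4,000 ha of surveyed land.
estimate = expected_site_count(sites_found=12, surveyed_ha=4000, parcel_ha=25000)
print(f"Expected sites in parcel: {estimate:.1f}")
```

The estimate is only as good as the surveyed sample: it assumes that survey
coverage within the stratum is representative and that recording standards were
comparable from survey to survey.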


The model is appealing largely because of its simplicity and probable cost- 
effectiveness as a first-step approximation of the nature and distribution of cultural 
resources in a vast area. It provides an idea of the number, content, size, and other 
characteristics of sites that might be expected in an unsurveyed area—information 
critical to realistic estimates of the time and money required to conduct on-the- 
ground surveys. As recognized by the authors of the report, however, the predic- 
tions are conditioned by the quality of the data, which varied from survey to survey. 
Furthermore, the approach is unlikely to be particularly useful in predicting the 
presence of theoretically expected but as yet undocumented kinds of sites because 
it relies entirely on information about previously discovered site types. It is also 
apparent that the model would not be of great use in gaining information about past 
environmental or cultural conditions or about why cultural materials are distributed
across the landscape in particular patterns, limitations also recognized by the
authors.


Seep Ridge Project Archaeological Inventory in the Seep Ridge Cultural Study Tract. 
Signa L. Larralde and Susan M. Chandler. Nickens and Associates. 1981


The Bureau of Land Management funded the Seep Ridge Project, which was 
carried out by personnel from Nickens and Associates, a private archaeological
consulting firm in Montrose, Colorado. Objectives of the part of the project with
which this summary is concerned were (a) to derive a formula that would determine
the probability of site occurrence at any point in the project area, and (b) to
delineate for management purposes areas suspected to contain an extremely low
density of sites. The authors noted the possibility that “project-by-project cultural
resources clearances may not be necessary” in some portions of these extremely low 
density areas (Larralde and Chandler 1981:1). 


Semiarid canyons, ridges, eroded buttes, and dune fields are characteristic of
the 44,292 ha project area, as are juniper, sagebrush, grasslands, and some desert 
riparian vegetation. The BLM used a 10 percent nonstratified, systematic random 
sampling scheme to preselect 274 16 ha tracts for survey. Within that area, 40 sites 
and 106 isolated finds were recorded; these remains represent all major occupations 
of the area, from Paleoindian to Euroamerican. A discriminant function analysis was 
used to compare the relationships between site and nonsite locations on the basis of 
environmental attributes— presence/absence of sand dunes, viewspread, distance 
to vantage points, distance to juniper forest, and a measure comparing site or 
nonsite vegetation with surrounding vegetation. High, medium, and low sensitiv- 
ity zones were delimited, primarily on the basis of positive correlation between high 
density and increasing proximity to juniper trees and sand dunes. 


The discriminant equation used in this project is described as a “powerful
management tool” because it requires data from only six variables and because
values for these variables can be measured for any point on a USGS topographic
map. When values for these variables are “plugged into” the formula, the result is a
probability estimate for site occurrence. The authors suggest that “if the probability
of site presence is low, archaeological clearance could be granted without the
necessity of a field check. If, however, the probability of site presence is near the 50
percent range, a field inventory would be in order” (Larralde and Chandler
1981:136). The authors stress that their results are intended as an example of the
power of the technique and that this particular equation should not be used in the
planning unit “until the function is strengthened by the inclusion of more data”
(Larralde and Chandler 1981:136).
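

To make the idea of “plugging in” map measurements concrete, the following Python
sketch converts a linear score into a probability and applies the suggested
management rule. Everything numeric here is an assumption: the coefficients are
placeholders, only three of the variables are used, the logistic transform is the
standard form for a two-group function with equal covariances rather than the
report's own formula, and the 0.10 cut-off for a “low” probability is arbitrary.

```python
import math

# Placeholder coefficients, invented for illustration; they are NOT the
# values published by Larralde and Chandler (1981).
HYPOTHETICAL_WEIGHTS = {
    "dunes_present": 1.5,        # 1 if sand dunes are present, else 0
    "dist_to_juniper_km": -0.9,  # farther from juniper lowers the score
    "dist_to_vantage_km": -0.4,  # farther from a vantage point lowers the score
}
HYPOTHETICAL_INTERCEPT = 0.0


def site_probability(measurements):
    """Turn map measurements into an estimated probability of site presence."""
    score = HYPOTHETICAL_INTERCEPT + sum(
        weight * measurements[name]
        for name, weight in HYPOTHETICAL_WEIGHTS.items()
    )
    # Logistic transform of the linear score (assumed form, see lead-in).
    return 1.0 / (1.0 + math.exp(-score))


def recommendation(probability, low_cutoff=0.10):
    """Apply the suggested rule; the 0.10 cut-off is an assumption."""
    if probability < low_cutoff:
        return "clearance without field check"
    return "field inventory"


point = {"dunes_present": 0, "dist_to_juniper_km": 5.0, "dist_to_vantage_km": 5.0}
p = site_probability(point)
print(f"{p:.3f} -> {recommendation(p)}")
```

As the authors themselves caution, a rule of this kind should not guide clearance
decisions until the underlying function has been strengthened with more data.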


Given the 97.1 percent “accuracy rate” claimed for one version of the discrimi- 
nant analysis, which classified only one of the 34 sites as a nonsite (Larralde and 
Chandler 1981:133), the modeling project appears to have successfully achieved its 
stated objective. There are, however, several potential problems with the model, 
two of which are noted here. The first problem is that the preselected one-half by 
one-eighth mile transects do not represent a random sample of the landscape in the 
project area because the central portion of each quarter-section had no chance of 
being selected (Berry 1984). The linear transects were “situated in quarter sections 
so that cadastral monuments could be used to maximize location control. . . . Each 
sample unit was systematically placed in its quarter section to extend from section 
corner to quarter corner” (Larralde and Chandler 1981:4). 


The second potential problem concerns the equation of zones of low site 
density with nonsignificance, that is, with areas that merit no further attention. 
Because of this equation there is no opportunity to determine whether scientifically 
important cultural resources are present in the low-density zone. It is clearly
possible that low site-density zones were occupied at some point in the past when 
environmental conditions were different and human population was low. Given the 
procedures summarized above, there would be little chance that old and rare sites 
would be discovered. 


Okanogan Highlands Project A Cultural Resources Predictive Land Use Model for the
Okanogan Highlands. R. R. Mierendorf, T. K. Eller, D. Carlevato, and P. A.
McLeod. Cultural Resources Group Report No. 100-2. Eastern Washington
University. 1981


The Bonneville Power Administration, Portland, Oregon, funded the Okano- 
gan Highlands overview predictive modeling project for an area in north-central 
Washington. The project was designed to evaluate possible disturbances to 
archaeological sites along proposed transmission lines, and it was implemented by 
the Bonneville Cultural Resources Group, Eastern Washington University. This
summary, based on Mierendorf et al. (1981), focuses on the prehistoric and ethno- 
graphic aspects of the land-use model. 


Low, forest-covered mountains and steep-walled valleys with steppe vegeta- 
tion are characteristic of the 2,166,200 ha study area. Existing site-file data were 
available for 459 sites representing all major periods of occupation (Paleoindian
through early historical). A predictive model of prehistoric land use was developed
based on the seasonal and spatial distribution of resources and on ethnographically
documented Native American settlement patterns and subsistence practices. In the
report the model is presented as a series of maps that delimit seasonal activity areas
and expected site densities (high to low); the latter are based on known site 
densities in similar areas. Examples of zones delimited on maps include “winter 
residence areas with moderate site density” and “summer hunting and gathering 
areas with the lowest site density.” A sensitivity analysis was conducted to evaluate 
construction impacts; it used the predictive model to assess potential site signifi- 
cance according to numeric values assigned for regional research significance, site 
density, and known impacts to cultural resources. Six sensitivity zones, which 
correspond to generalized geographic strata, were plotted on maps. 


The model provides considerable information about the general location of 
different kinds of sites but almost no information about the probability of encountering
a site at any specific location. Even so, it permitted initial estimation of
possible disturbance to sites that would result from construction of a powerline
across the project area. The authors note that important sites could occur in the one
“low site density/low sensitivity zone” and in two of the low density/moderate
sensitivity zones that they have defined, but they consider the probability of
encountering such a site along a powerline to be low. They expect that “future
surveys [in the low density/low sensitivity zone] will locate sites that are regionally
important” (Mierendorf et al. 1981:117). 


Reliance on the ethnographic record to predict prehistoric land-use patterns 
considerably reduces the generalizing power of the model. The authors recognize 
one aspect of this problem when they suggest that changing resource distributions 
might have caused changes in the location of activities. What they do not seem to
recognize is that at times in the past, especially when human population densities
were much lower than those of the ethnographic present, different land-use systems
probably operated. For example, one would expect
different distributional patterns for different site types depending on whether 
people spend the winter near stored foods or depend on frequent moves among 
areas where food resources are available. In the latter case, winter village sites might 
not be located in the riverine zone, and fishing sites might not be nearly as common 
as they were during the ethnographic period. If the prehistoric winter pattern was 
one of frequent residential moves, a number of small, short-duration residential 
sites might be located at some distance from the river. Overly heavy reliance on the 
ethnographic record in developing predictive models could result in cultural 
resources representative of a very different land-use system remaining undetected.


Salmon River History Project An Overview History in the Drainage Basin of the Middle
Fork of the Salmon River. Mary P. Rossillon. Cultural Resources Report No. 6. 
USDA Forest Service, Intermountain Region. 1981 


Historical research conducted for the Salmon River project was done by 
personnel representing Washington State University and the University of Idaho.
The study was sponsored as a joint venture involving these universities, the Idaho
State Historical Society, the Forest Service, and the Idaho State Historic Preserva- 
tion Office (Knudson et al. 1982). Early in the project the researchers recognized 
that little information was available about the stockmen’s culture in the central 
Idaho area. As one means of acquiring that information, a model was developed to 
predict the locations of nineteenth-century stockraising-associated sites. Informa- 
tion summarized here was taken from Rossillon (1981). 


Mountains and upland valleys are characteristic of the 320,000 ha study area. 
Coniferous forest, some of which is relatively open, is the dominant vegetation 
zone, followed by grasslands and meadows. The entire study area was subdivided 
into 3 by 3 km grid units, and each unit was characterized according to its distance 
from a local market, the palatability of summer and winter range for cattle and 
sheep (a calculation based on the percentage of readily accessible fodder), and
expected hay production (based on the number of cattle and sheep that could be
supported). High-use areas—those with the greatest potential for grazing and hay 
production and those with the longest growing seasons — were located and mapped. 
Winter cattle and sheep grazing areas were predicted to be associated with perma- 
nent log structures (ranch headquarters), and summer grazing sites (temporary 
camps) were predicted to be associated with limited scatters of historical artifacts 
and perhaps with less-permanent structures (e.g., simple corrals).


The model provides insight into the probable distribution of sites created by 
stockraising activities, and it provides a framework for assessing the significance of 
such sites. Although its spatial resolution is low, it does provide a way of estimating
site presence for every 900 ha area, and it illustrates that the sites tend to be near 
creeks. It could be argued, with some justification, that the model is overly 
simplistic. This project should be recognized, however, as one of the earliest
attempts to deal with Euroamerican ranch sites as a resource of concern to cultural 
resource managers and as a potential data base for acquiring important information 
about regional history. Viewed from that perspective, the model was successful. 
This model and the one developed by Hackenberger (1984; see below) have a similar 
procedural logic, and both were an outgrowth of a Forest Service reconnaissance 
predictive modeling project (Knudson et al. 1982). 


Bisti-Star Lake Project Archaeological Variability within the Bisti-Star Lake Region.
Meade F. Kemrer, editor. ESCA-Tech. 1982 


Archaeological investigations for the Bisti-Star Lake Project were funded by
the Bureau of Land Management and carried out by personnel representing the
Albuquerque office of ESCA-Tech, an environmental consulting firm. The modeling
objectives for the project were to develop and refine methods capable of
predicting the presence of sites with specific cultural and temporal characteristics.
That information would then be used to generate formal predictions concerning the
density of sites of various types throughout the project area (Kemrer 1982).


Sagebrush, rabbitbrush, greasewood, and other semiarid vegetation is characteristic
of the dissected plateaus in the 31,413 ha project area, which lies within the
San Juan Basin of New Mexico. Landsat data were generated and coded for the
project area in 2 by 2 km grid units (400 ha each). Seventy-two environmental
variables, consisting of different combinations of eight environmental classes (e.g.,
Avalon-Sheppard-Shiprock soil association and major washes), were derived from
Landsat data; one data set (presence/absence of variable states) contained all the
unique two-way interactions between environmental classes. The archaeological
data base for the initial model consisted of existing site-file data from surveyed areas 
within and adjacent to the project area. Site type and content data as well as 
information on cultural/temporal affiliation were examined for more than 450 
components. Eight site classes were developed using analysis of variance tech- 
niques. A backward step-wise multiple regression was used to reduce the number of 
environmental variables, and other linear equations were used for modeling site 
component densities. Projected site densities for the 2 by 2 km grid units were 
plotted on maps. 


The project area was then subdivided into a number of leases, and a sample 
totaling about 4600 ha (ca. 15 percent of the total project area) was judgmentally
selected and surveyed. Choice of parcels to be included in the judgmental sample 
was based, in part, on land ownership, size of sample units, and predicted cultural 
resource variability. A total of 92 sites and 213 isolated finds were documented. Some 
Paleoindian and Archaic sites were found (11 of 319 components), but most remains 
were classified as Anasazi, Navajo, historical, or lithic scatter sites. Resulting data
were added to the existing site-file data base as a means of testing and refining the 
initial model. A regression analysis approach was again used to produce the refined 
model. When the augmented cultural resource data base was analyzed with 34 
environmental variables, figures showing the percentage of explained variance were 
generated for each of the site types. Mean site-frequency predictions were gener- 
ated for more than 800 grid units and plotted on eight maps, one for each of the 
following site types: lithic sites, Anasazi sites, pre-1933 Navajo sites, post-1933 
Navajo sites, total Navajo sites, and total sites. 


The overall modeling approach yielded information on the range of variability 
in cultural/temporal components, site types, and site densities. The means by
which this was accomplished and the overall reliability of the results are not always
obvious. Much of the discussion on model development is difficult to comprehend,
and decisions about selection of areas for survey were highly judgmental. The 
project area, the area from which the environmental data were extracted, and the 
survey area were all different, and the size of survey units differed from subarea to 
subarea. These factors may have affected the results of the statistical analysis. 


There are also potential problems with the manner in which field information 
was gathered and analyzed. These problems make it difficult to replicate the overall 
approach and may well have caused the model to yield arbitrary results. Isolated 
finds, for example, were excluded from site density estimates. Unfortunately the 
criteria used to distinguish isolated finds from sites were not rigorous. In fact,
considerable overlap is likely given that different survey teams and different
individuals operationalized the site and isolated find definitions: 


Sites were differentiated from isolated cultural occurrences on the basis of information
potential. A site was defined as a locus manifesting the outcomes of past human behavior
which contained more identifiable or potential scientific data values than could be
effectively extracted at the time of survey. Isolated occurrences were defined as those
cultural manifestations whose scientific data values could be adequately documented by
the survey [Cella 1982:75].


Another factor that might have led to arbitrary results has to do with the 
manner in which sites were classified as to type. The most obvious case is the 
merging of identified Paleoindian and Archaic components with unidentified lithic 
components to create a single type. That procedure probably masks a significant 
portion of the observed cultural/temporal and site type variability, yet detection of
that variability was one of the major goals of the project. 


Ozark-St. Francis National Forests Project A Cultural Resources Overview of the
Ozark-St. Francis National Forests, Arkansas. George Sabo III, B. Waddell, and J. H.
House. Arkansas Archaeological Survey. 1982


Arkansas Archaeological Survey personnel conducted this overview project in
the Ozark-St. Francis National Forests for the Forest Service (Sabo et al. 1982). The
principal objectives were to assess the potential nature and distribution of prehistoric
and historical sites in unsurveyed areas and to provide predictions concerning
the nature and distribution of cultural resources. This information was to be
incorporated into multiple resource management plans. 


The 461,000 ha project area encompasses two national forests in northwestern 
and east-central Arkansas. As a means of generating expectations for the nature and 
distribution of cultural resources, a series of deductive adaptational models were 
developed. Four temporal periods were defined jointly by adaptation type and 
paleoenvironmental type: Late Pleistocene/Early Holocene hunting and gathering;
Middle Holocene hunting and gathering; Late Holocene (post-Hypsithermal)
hunting, gathering, and plant husbandry; and Late Holocene horticultural, hunt- 
ing, and gathering. Initial narrative predictions were made concerning the distribu- 
tion, content, and types of sites within each of four major environmental zones: 
river bottomland, upland slopes, bluff lines, and upland plateaus. A similar
approach was used to define seven major and seven supplementary historical
adaptation-type models. Examples of these ethnohistorically and historically
recorded types include Osage (AD ?-1804), Creek (1794-1828), Spanish (1673-1803),
pioneer hunter/herder (1803-ca. 1840), Civil War (1860-1875), resorts (ca.
1860-present), and Forest Service (1908-present).


Biophysical data, including elevation, soil types, topographic settings, physiographic 
subdivision, and vegetation types, were coded for 259 known sites that 
could be plotted reliably on USGS quadrangles. Q-mode cluster analyses were 
performed separately on prehistoric and historical sites. Univariate and bivariate 
statistical procedures were used to determine which variables correlated best with 
site locations. The important variables were topographic setting, soil capability, 
distance to water, and elevation. The resulting inductive models yielded four 
clusters. These were qualitatively compared with expectations derived from the 
adaptation-type models. It was concluded that the inductive, Q-mode analyses 
generally supported the deductive models. Site likelihood zones based on topographic 
setting for historical and prehistoric sites were plotted on maps, and 
generalized site-density-potential values (high to low) were assigned to each zone. 


As is the case with most deductive predictive modeling approaches, the end 
result of this project provides only limited spatial resolution for the predictions. In 
this case, most of the zones comprise thousands of hectares, and available data do 
not permit a finer resolution of density and/or potential anywhere within a given 
zone. Furthermore, this kind of model is difficult to falsify, largely because of its low 
spatial resolution and generalized treatment of site content data. It does, however, 
meet its objective in that predictions are made for the potential nature and 
distribution of cultural resources. The approach also allows for, and in fact encour- 
ages, the discovery of site types that are undocumented but theoretically expected 
in the study area. Examples include most of the Pleistocene site types and types 
representative of seventeenth- and eighteenth-century adaptations. Furthermore, 
the issue of site significance is divorced from the concept of site likelihood 
zones: the authors note that “significance must be determined on a case-by-case 
basis . . ., and a site in any likelihood zone could easily turn out to be highly 
significant” (Sabo et al. 1982:188). 


Passaic River Basin Project A Preliminary Cultural Resource Sensitivity Analysis for the 
Proposed Flood Control Facilities Construction in the Passaic River Basin of New Jersey. 
Robert Hasenstab. Soil Systems, Inc. 1983 


The New York District Corps of Engineers funded the Passaic River Project; 
Robert Hasenstab (University of Massachusetts, Amherst) implemented the project 
through a subcontract with Soil Systems, Inc., an environmental consulting 
firm based in Marietta, Georgia. The project's objectives were to estimate the 
quantities of cultural materials likely to be affected by proposed flood-control 
facilities and to define areas with a high probability of site occurrence (Hasenstab 
1983). 


The 1619 ha project area extends 160 linear km along the Passaic River, crosscutting 
ridge and valley, piedmont, coastal plain, and tidal/estuarine areas. Urban 
and commercial developments occupy most of the impact zone, but 42 percent is 
either agricultural, forested, or classified as wetlands. The project area was subdivided 
into a high-resolution grid of 0.47 ha units (pixels) for which various environmental 
variables were coded; all manipulation and mapping utilized a GIS. Univariate 
statistical tests were employed to determine which environmental variables 
were most useful for their power to “retrodict” known site locations. Significant 
variables were found to be soil drainage, distance to nearest river, distance to minor 
tributary confluence, and distance to a major tributary river confluence. Grid cells 
were assigned a sensitivity rating by summarizing the various cultural component-variable 
ratings. The sensitivity models were then tested and revised using data 
derived from a survey of 300 pixels (ca. 140 ha) representing a stratified random 
sample of the project area (with some modifications). Overall, the sample fraction 
was about 6.5 percent of the impact zone. The survey techniques included limited 
but systematic subsurface testing within judgmentally selected pixels. Twenty-eight 
historical sites and 16 prehistoric sites were recorded. A series of computer-generated 
maps illustrated the final model on a pixel-by-pixel basis in terms of 
prehistoric archaeological sensitivity (high, medium, or low, based on the cultural 
component-variable ratings) and a combination of historical and prehistoric 
sensitivity. 
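
The rating step described above can be illustrated with a minimal sketch. It assumes a simple additive scheme in which each pixel receives one rating per environmental variable and the summed score is binned into high, medium, or low sensitivity. The variable classes, rating values, and cutoffs below are hypothetical and are not taken from Hasenstab's report.

```python
# Hypothetical per-variable ratings (0 = unfavorable, 2 = favorable).
# These classes and values are illustrative assumptions, not the report's.
RATINGS = {
    "soil_drainage": {"poor": 0, "moderate": 1, "well": 2},
    "river_distance": {"far": 0, "mid": 1, "near": 2},
    "confluence_distance": {"far": 0, "mid": 1, "near": 2},
}


def sensitivity(pixel):
    """Sum the variable ratings for one grid cell and bin the total."""
    score = sum(RATINGS[var][pixel[var]] for var in RATINGS)
    if score >= 5:  # assumed cutoff for "high"
        return "high"
    if score >= 3:  # assumed cutoff for "medium"
        return "medium"
    return "low"


pixels = [
    {"soil_drainage": "well", "river_distance": "near", "confluence_distance": "near"},
    {"soil_drainage": "moderate", "river_distance": "mid", "confluence_distance": "mid"},
    {"soil_drainage": "poor", "river_distance": "far", "confluence_distance": "mid"},
]
labels = [sensitivity(p) for p in pixels]
```

A scheme of this kind is easy to map cell by cell, but the result depends entirely on the chosen ratings and cutoffs, which is one reason independent field testing of the kind described above matters.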


The author concludes that the GIS approach “has greatly enhanced the 
capabilities for archaeological prediction and land-use management, . . . [but it] 
cannot be taken as a final solution to all cultural resource management problems” 
(Hasenstab 1983:13). The land managers did learn something new about the 
distribution of sites, but not much about their nature. Hasenstab’s (1983:1-n, 14-16) 
self-critique warrants close attention, since the problems he identifies are shared by 
many models: (a) the grid resolution may have been too coarse to detect important 
variables (such as small sandy knolls), (b) no attempt was made to deal with 
problems of spatial autocorrelation, (c) no consideration was given to understanding 
the effects of different variables on different site types, and (d) the fieldwork was 
probably not of sufficient scope to assess the model adequately. The approach is also 
problematic because it lumps together all prehistoric sites and thus tends to obscure 
the variability that is represented by thousands of years of human occupation. 


Like some of the other models discussed here, this one also equates high 
likelihood zones with a high potential for the occurrence of significant sites. 
Furthermore, it equates low sensitivity with nonsignificance and with a lack of 
necessity for legal protection. This is demonstrated in the following statements 
from a subsection of the report entitled “Synthesis of Cultural Resources 
Sensitivity”: 


Finally, 20 percent of the project area could be “written-off” legitimately. The low 
historic/low prehistoric sensitivity stratum (10 percent of the project area) would yield a 
very low return on encountered cultural resources. The medium historic/low prehistoric 
sensitivity stratum (10 percent), as mentioned above, could be sacrificed, as a 
substantial portion of the medium sensitivity stratum will already have been sampled 
[Hasenstab 1983:134]. 


Such a conclusion does not seem compatible with a preliminary cultural resource 
sensitivity analysis, which the title of the report indicates that this was intended to 
be. Neither does it seem to be compatible with the author's recognition that the 
sample survey may not have been of sufficient scope to permit adequate assessment 
of the model. 


Grand Junction Resource Area Project A Manual for Predictive Site Location 
Models. Kenneth L. Kvamme. Draft report submitted to the Bureau of Land 
Management, Grand Junction District. 1983 


The BLM funded the Grand Junction Resource Area Project as an overview of 
statistical classification procedures for predicting archaeological site locations. This 
summary emphasizes aspects of the project related to the development and testing 
of models in the Grand Junction Resource Area. For that area, the objective was to 
develop quantitative models that could be used to predict likely locations of 
prehistoric sites (Kvamme 1983; see also Chapters 7, 8, and 10). 


The project area encompasses some 438,996 ha of western Colorado uplands. 
Vegetation types characteristic of the area include desert grasslands as well as 
pinon-juniper woodlands and spruce-fir forests. The subareas of the district were 
stratified into five major biotic communities considered to occur in significant 
proportions across the landscape. A stratified proportional random sample of 65 ha 
quadrats (quarter sections) was selected from the physiographically defined subareas. 
One hundred quadrats were selected for survey, specifically to provide the 
data base for generating the models. The surveyed area amounted to about 1.5 
percent of the project area. Environmental data were coded for site and nonsite 
locations. Through a series of statistical analyses, the following variables were found 
to be important in distinguishing between site and nonsite locations: biotic zone, 
vertical distance to permanent water, vantage point distance, slope, view, exposure, 
shelter within 100 m, and shelter within 250 m. The models were developed 
through a pattern-recognition approach using various multivariate analyses as 
classification tools, the most successful of which was logistic regression. Depending 
upon the particular approach used, GIS-based probability surface maps were 
generated to provide discrete predictions for site and siteless loci in unsurveyed 
areas covering from 0.6 to 1 to 25 ha. The accuracy of the various models was tested 
independently using site-file and nonsite data, as well as split sampling techniques. 
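
To make the classification idea concrete, the following is a minimal sketch of a two-variable logistic regression fitted by batch gradient descent. It is not Kvamme's procedure or software: the function names, the two predictors (rescaled slope and distance to water), the learning rate, and the eight training locations are all hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, rate=0.5, epochs=5000):
    """Fit weights (intercept first) by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            features = [1.0] + list(x)
            error = sigmoid(sum(w * f for w, f in zip(weights, features))) - y
            for j, f in enumerate(features):
                grad[j] += error * f
        weights = [w - rate * g / n for w, g in zip(weights, grad)]
    return weights


def site_probability(weights, x):
    """Estimated probability that a location with predictors x holds a site."""
    features = [1.0] + list(x)
    return sigmoid(sum(w * f for w, f in zip(weights, features)))


# Hypothetical training data: (rescaled slope, distance to water in km).
rows = [(0.2, 0.1), (0.3, 0.2), (0.1, 0.3), (0.4, 0.2),   # site locations
        (1.5, 1.2), (1.2, 1.5), (1.8, 0.9), (1.0, 1.4)]   # nonsite locations
labels = [1, 1, 1, 1, 0, 0, 0, 0]
weights = fit_logistic(rows, labels)
```

Evaluating `site_probability` for every cell of a grid yields a probability surface of the general kind described above. Because the fitted probabilities here are checked against the same locations used for fitting, they say nothing about performance on unsurveyed land; that requires independent or split-sample testing.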


Kvamme's approach to predictive locational modeling is statistically and 
computationally more sophisticated than that exhibited by other projects summarized 
here. The report is clearly an important contribution in that it provides a 
thorough overview and many examples of a wide variety of statistical approaches to 
developing and testing inductive, or correlative, models. The project did not, 
however, achieve the goal stated by the author, namely “to model the locations of 
all sites, regardless of type, because all sites are of potential interest to Cultural 
Resource Management” (Kvamme 1983:9). 


This suggestion that the Grand Junction Resource Area report failed to model 
the location of all sites is based on three observations. The first concerns the 
apparent paucity of sites in 38 percent of the project area. Kvamme suggests that 
the low density of sites (four were known) in the high-elevation community (which 
comprises 15 percent of the resource area) is a result, in part, of “the dense 
vegetational cover occurring at high elevations which inhibited site discovery” 
(1983:62). At the same time, only a few sites (26) occur in the desert community, 
which represents about 24 percent of the resource area (no explanation is offered for 
this low density). Since 85 percent of the sites are in the pinon-juniper community, 
which constitutes only 62 percent of the project area, it was determined that 


Because of the paucity of sites in all but the pinyon-juniper communities, it will not be 
possible to make meaningful comparisons of site location patterning between communities 
in the analyses that follow although this was originally intended [Kvamme 1983:62]. 


Because two major zones with very different resource potentials for aboriginal 
hunters and gatherers were effectively excluded, it seems likely that potentially 
important site types were not modeled accurately. 


Another aspect of this research that hampered modeling of all site locations 
was the exclusion of rockshelters from the analyses owing to the assumption that 
“their locations cannot be predicted because of the idiosyncratic geological processes 
that regulate their presence” (Kvamme 1983:68). Although identified rockshelters 
represent only 2.5 percent of the recorded sites, they have considerable 
potential to yield important information. 


A final point concerns the arbitrary distinction drawn between sites (10 or 
more artifacts in a 20 m diameter area) and isolated occurrences (fewer than 10 
artifacts in an area of the same size). 


In order to make the analyses of site locational patterning more manageable and also to 
reduce the idiosyncratic locational variation undoubtedly exhibited by isolated occurrences 
of artifacts (in many instances), only “concentrations” of artifacts were recorded 
as sites and analyzed here [Kvamme 1983:67]. 


Many archaeologists might argue that sites are often represented by fewer than 10 
pieces of pottery or chipped, ground, or battered stone. Another potentially 
important site type —small, low artifact density — was therefore excluded from the 
model. 


Kaibab and Cuba Study Area Projects Theory and Model Building: Refining Survey 
Strategies for Locating Prehistoric Heritage Resources. Linda S. Cordell and Dee F. 
Green, editors. Cultural Resources Document No. 3. Forest Service, Southwestern 
Regional Office. 1983 


The Kaibab and Cuba study areas are part of a project sponsored by the Forest 
Service as a collaborative effort among archaeologists from academic and federal 
communities (Cordell and Green 1983). Specifically, participants in the endeavor 
were asked to formulate trial predictive models that could be refined and tested. 
The information summarized here is from two model-building articles, one about 
the Tusayan District in the Kaibab National Forest (Plog 1983a) and one about the 
Cuba District in the Santa Fe National Forest (Plog 1983b). 


Study Area 1, the Tusayan Ranger District of the Kaibab National Forest, is 
located in northern Arizona. It is on an upland plateau that is dissected by 
intermittent streams and covered by pinon-juniper, ponderosa, and sagebrush 
vegetation communities. Known sites in the area are the results of Archaic, Anasazi, 
and Cohonina occupations. The objectives for the Kaibab study area were to use 
previously derived information from a 1 percent sample survey (designed for 
planning purposes) to make predictions about site densities across the landscape 
and to “test” the predictions by comparing them with the results of intensive 
surveys conducted in nearby areas. 


For the sample survey the 4858 ha study area was divided into zones based on 
drainage basins and vegetation types. Using the results of the 1 percent sample 
survey, the researcher estimated site densities for the various zones. The estimated 
densities were found to differ considerably from the observed densities in nearby 
intensively surveyed areas. The differences were judged to be the result of the 
nonquantitative fashion in which the estimated density figures were generated 
(e.g., there was no rationale for dividing the area into drainage basins, and zones 
without sample data were assigned zero density values). In the case of this trial 
formulation, it was concluded that “had SYMAP or some other spatial smoothing 
program been employed, a successful predictive model might have been generated” 
(Plog 1983a:66). 


The Cuba District study area (Study Area 4) is a 3427 ha block unit in the 
forested upland zone of north-central New Mexico. In this case the objective was to 
examine the feasibility of doing predictive modeling by drawing upon the results of 
intensive surveys of the block area. The study area was surveyed in part by a Forest 
Service crew and in part by a contractor’s crew. A total of 142 sites, all dating to the 
Gallina phase (AD 1150-1250), were documented. These included sites with surface 
structures, pithouses, towers, and check dams. An analysis of the survey data 
revealed that 9 percent of the sites were located on ridge tops, while this topographic 
feature constituted only 23 percent of the survey area. Even though few sites were 
found on the valley floors (and all of these were found by a single crew), it was 
recognized that these sites could potentially provide “important and unique 
evidence” about the area. 


The researcher concluded, therefore, that if surveys in this study area were 
focused on the valley floors and ridge tops, coverage could be limited to 38 percent 
of the study area and almost all the sites would still be discovered. It was also argued 
that once a number of valley floors had been surveyed it should soon be possible to 
distinguish the characteristics of those valley floor ecosystems that would have 
associated sites from those that would not have sites. The author concludes his 
study by stating that “data from this study area result in as clear a definition of an 
approach for finding all sites with less than inventory survey as one can imagine” 
(Plog 1983b:78). 


Both trial formulations of predictive models are presented in a brief and simple 
fashion. The models are mapped to illustrate the locations of high site-density zones 
within the outlines of the study areas. The lack of background information about 
these study area projects makes it difficult to understand how data were gathered. 
Much of the information necessary to compare this approach with others is not 
readily obtainable from the report. It appears that the underlying purpose of this 
project was to determine whether or not portions of the areas could be exempted 
from on-the-ground survey by relying on the results of previous surveys in similar 
environmental settings. In the case of the Kaibab area, for example, it is argued that 
had the appropriate “spatial smoothing program” been used, “the predictive 
model generated in the planning document would have allowed a no-survey 
decision to be made” (Plog 1983a:65). 


What remains unexamined in these trial formulations is the reliability of the 
existing survey data. The problems attributed to a “nonquantitative” approach in 
the Kaibab study area can be alternatively explained by arguing that different 
people conducted the surveys for different reasons and that, consequently, the 
results are likely to be different. The uncritical acceptance of the survey results in 
the Cuba area (which indicated that throughout prehistory the area was inhabited 
only for a 100-year period, between AD 1150 and 1250) is also questionable. Could 
past erosional conditions have filled the valley floors so that only relatively recent 
sediments are exposed, thus masking evidence that the area was also used or 
occupied by other groups of people? Is it possible that ground cover obscured all but 
the most obvious (i.e., architectural) cultural features? The information that one 
survey team found all the recorded valley floor sites indicates the potential for 
problems in data reliability; other things being equal, one might logically conclude 
that different survey methods were used. What may be needed here is not merely 
refinement and testing of trial formulations, but a reformulation of the approach to 
predictive modeling, one that recognizes the complex variation inherent in the 
archaeological record. 


Fort Benning 2200-Acre Survey Project An Intensive Survey of a 2,200 Acre Tract 
within a Proposed Maneuver Area at the Fort Benning Military Reservation. P. M. 
Thomas, Jr., L. J. Campbell, M. T. Swanson, J. H. Altschul, and C. S. Weed. 
Report of Investigations No. 71. New World Research, Inc. 1983 


The Department of Defense (U.S. Army Infantry Center and Fort Benning 
Military Reservation) funded this project, the second project carried out within the 
confines of Fort Benning to be summarized in this appendix. This study was 
administered by the Archeological Service Branch, Division of National Register 
Programs, National Park Service, Southeast Region (Atlanta) and carried out by 
personnel representing New World Research, an archaeological consulting firm 
based in Pollack, Louisiana. This project was designed to conduct an intensive 
survey and to test and refine a predictive model developed for the area three years 
earlier by another consulting firm (Kohler et al. 1980; see above). Information 
presented in this summary is from Thomas et al. (1983). 


Pine forests, oak and oak/hickory uplands, bottomland hardwoods, wooded 
swamps, and mixed pine/hardwood forests are characteristic of the 8907 ha proposed 
maneuver area that was the focus of this project. A block of land amounting to about 
10 percent of the project area (ca. 891 ha) was preselected and surveyed to provide a 
data base for evaluation of the larger maneuver area and for the testing of the 
existing predictive model. Thirty-seven sites were identified: 20 prehistoric, 15 
historical, and 2 with both prehistoric and historical components. Site locations 
were assessed according to a predictive model based on soil type, slope, and distance 
to water, which was developed by Kohler et al. (1980). The model was found to be 
basically sound but in need of some refinements, including a more accurate mapping 
of the distribution of soil types. In an effort to refine the model and determine which 
variables best explained the observed variation, a discriminant analysis was under- 
taken. Data for 10 environmental variables, including some information from a 
hypothetical catchment area with a 225 m radius, were coded at the 37 site locations 
and at 40 siteless locations. Ultimately, combinations of the variables were identified 
that could be used to define very high, high, low, and very low probabilities for 
encountering prehistoric and/or historical sites at any given location. A second 
discriminant analysis was performed on a data set from other portions of the project 
area; this data set consisted of 207 known sites and siteless points, including the 77 
cases from the surveyed area. The discriminant analysis successfully reclassified 
more than 96 percent of the cases. 
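
A reclassification rate of this kind can be illustrated with a minimal sketch of a two-group linear discriminant. The example assumes Fisher's classical formulation with a pooled covariance matrix and only two predictors; the report's analysis used 10 variables, and the function names and the eight data points below are hypothetical.

```python
def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(2)]


def pooled_covariance(group_a, group_b):
    """Pooled within-group 2x2 covariance matrix."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for group in (group_a, group_b):
        m = mean(group)
        for v in group:
            d = [v[0] - m[0], v[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(group_a) + len(group_b) - 2
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]


def fit_discriminant(sites, nonsites):
    """Return (weights, cutoff) for Fisher's two-group linear discriminant."""
    ma, mb = mean(sites), mean(nonsites)
    s = pooled_covariance(sites, nonsites)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cutoff is the discriminant score of the midpoint between group means.
    cutoff = w[0] * (ma[0] + mb[0]) / 2 + w[1] * (ma[1] + mb[1]) / 2
    return w, cutoff


def reclassification_rate(sites, nonsites):
    """Share of the fitting cases that the function assigns to the right group."""
    w, cutoff = fit_discriminant(sites, nonsites)
    correct = sum(w[0] * v[0] + w[1] * v[1] > cutoff for v in sites)
    correct += sum(w[0] * v[0] + w[1] * v[1] <= cutoff for v in nonsites)
    return correct / (len(sites) + len(nonsites))


# Hypothetical cases: (slope in percent, distance to water in hundreds of m).
sites = [(2, 1), (3, 2), (4, 1), (3, 3)]
nonsites = [(10, 8), (12, 6), (9, 9), (11, 7)]
rate = reclassification_rate(sites, nonsites)
```

Note that the rate is computed on the same cases used to build the function. Such resubstitution figures tend to be optimistic, so a high value is weaker evidence than the same score on independently surveyed locations.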


The project achieved its stated goals of testing and refining the existing 
predictive model. The refinements took the form of more accurate mapping of soil 
types and of the generation of a discriminant function that permits calculation of the 
probability of encountering a site at any given point on the landscape. Although a 
very low site-density zone is defined, it is neither tied to any significance determi- 
nation nor used as an argument to exclude the area from future surveys. Like many 
of the other correlative or inductive models, this one masks much of the important 
variability in the archaeological record by lumping all prehistoric site types into one 
group. 


Cisco Desert Project A Class II Survey and Predictive Model of Selected Areas in the Cisco 
Desert, Grand County, Utah. J. E. Bradley, W. R. Killian, G. R. Burns, and M. A. 
Martorano. Cultural Resources Report No. 10. Goodson & Associates. 1984 


Goodson and Associates, a private consulting firm, conducted the Cisco Desert 
Project for the Bureau of Land Management. The project’s modeling objectives 
were to use existing data to construct a predictive model for the location of site and 
siteless areas and to test the model with results of a sample survey (Bradley et al. 
1984). 


The 32,389 ha project area lies within the Colorado Plateau region of east- 
central Utah and is characterized by desert shrub, greasewood, and juniper wood- 
land vegetation communities. Although the plan was to test an existing model, it 
soon became apparent that the existing model was inadequate for the project area, 
both because the project area had a much higher site density and because sites were 
found in many microenvironmental zones (e.g., dunes and rockshelters) that were 
not present in the areas for which the original model had been developed. The 
solution adopted was to build a model using information from a 5 percent sample 
survey conducted as part of the project, and then to test the model on data collected 
during previous surveys. One hundred 16 ha (40 acre) tracts were selected for 
survey using a simple random sampling technique; an additional 17 tracts were 
selected on what amounted to a judgmental basis. A total of 126 sites were recorded 
within the randomly sampled 1619 ha area; 15 sites were historical and 111 were 
aboriginal, representing early Archaic through protohistoric occupation of the area. 
Eighty-eight sites (40 lithic scatters and 48 campsites) and 51 siteless locations from 
within the 5 percent sample were employed to construct four discriminant analysis 
models (two for each site type) using either “traditional” modeling variables, such 
as slope, distance to water, vegetation, etc., or soil unit variables. The soil unit 
models were found to be more accurate and easier to use. Sensitivity ratings for 
high, medium, low, and unknown (the sample for one soil type was too small for 
predictive purposes) chances of encountering a site were calculated on the basis of 
the various soil units. Soil unit/projected site density values were mapped for the 
entire project area. The overall results were judged to compare favorably with those 
generated from an existing model derived from a 10 percent sample survey of 58,705 
ha in adjacent areas. 


Although the manner in which the models were developed and tested differed 
from the original plan, the overall objective was achieved. More specifically, an 
environmental variable—soil unit—was identified as an accurate predictor of site 
locations, and areas of low site density were delineated for management purposes. 
An obvious shortcoming, however, is what the authors refer to as the lack of an 
adequate data base for making predictions in Soil Unit 9, which constitutes 7.8 
percent of the project area. Too few transects were surveyed in areas with this soil 
unit, and too few sites were discovered in those transects to permit confident 
inferential model construction. 


Bradley et al. (1984:88) draw the reader’s attention to the fact that many of the 
sites misclassified in the discriminant analysis (ca. 15 percent) were in Soil Unit 2 
(48.3 percent of the area and 0.95 sites per mi² in surveyed areas). Many of these sites 
were also located within 1287 m (0.8 mi) of an area of Soil Unit 3 (13.4 percent of the 
area, 27.35 sites per mi²). Given this situation, their recommendation with regard to 
additional survey of Soil Unit 2 areas is as follows: 


If survey requirements in this zone are waived by the BLM, isolated eligible sites may be 
endangered. It is recommended that all areas within .8 mile of soil units 3, 5, 8, and 9 
continue to be surveyed in order to protect these sites and further test the model's 
accuracy. This .8 mile buffer includes sites misclassified by the soils model [Bradley et al. 
1984:96]. 


Continued survey in the buffer zone would test the model only in regard to site 
density in the buffer zones; it would not be a test of whether National Register-eligible 
sites are present in the other portions of Soil Unit 2. This approach accepts the 
possible loss of an unknown number of sites in approximately 25-30 percent of the 
project area, and it recognizes that some of the sites may be eligible for inclusion in 
the National Register of Historic Places. By its reliance on modern environmental 
distributions, it potentially jeopardizes the opportunity to discover and investigate 
sites that may have been utilized and/or occupied at times in the distant past when 
the desert shrub stratum, including Soil Unit 2, was more like today’s juniper 
stratum (i.e., Soil Unit 8). 


Route 13 Relief Corridor Project A Cultural Resources Reconnaissance Planning Study of 
the Proposed Rt. 13 Relief Corridor, New Castle and Kent Counties, Delaware. J. F. Custer, 
P. Jehle, T. Klatka, and T. Eveleigh. University of Delaware. 1984 


The Route 13 project was funded by the Delaware Department of Transporta- 
tion with the objective of identifying zones within a proposed highway corridor that 
were likely to contain significant prehistoric and/or historical resources. The 
project was conducted as an overview planning study by personnel representing 
the Center for Archaeological Research at the University of Delaware (Custer et al. 
1984). 


Wetlands, agricultural lands, and urban areas occupy most of the 64.4 by 11.3 
km project area (ca. 72,772 ha) in north-central Delaware. The predictive model was 
developed within different contexts: one for the environment and the other for 
regional cultural history. It relies heavily on the results of previous overviews. A 
number of site types (e.g., macroband basecamps, procurement sites, quarry sites, 
and industrial, commercial, and transportation sites) were recognized for various 
prehistoric and historical periods. Site types were characterized according to their 
environmental settings, and the information was summarized in tables that repre- 
sent a general locational model. The general model was compared in a narrative 
with the results of a Landsat Odessa terrain analysis (pixel size = 2.3 ha) that 
incorporated site locational information. Logistic regression analysis was used to 
correlate environmental zones with site presence. Maps were produced to illustrate 
known site locations and probability zones for different ages and kinds of prehistoric 
sites. Tables provide information about the relative potential for encountering 
significant historical sites in individual pixels. A separate and very general deduc- 
tive model was developed to predict and explain the distribution of Adena mortuary 
exchange sites. A second series of maps was generated to illustrate the high, 
moderate, and low sensitivity zones in terms of their potential for containing 
significant sites. In essence, high probability zones had the greatest sensitivity and 
the greatest potential for containing significant sites. 


This project considers a wide range of site types in terms of their predicted 
locations and potential significance. The concept of significance is defined in a 
manner such that small, disturbed, and plow zone sites are largely excluded. 
Considerable attention is given to an assessment of the quality (i.e., reliability) of 
the information in existing site files. For most zones the available information is 
rated as “poor” or “fair.” Given that kind of data the value of developing a series of 
correlative models for a wide variety of prehistoric and historical site types seems 
questionable. High probability zones and/or big sites with large quantities of 
artifacts are viewed as potential National Register properties and small procurement 
sites, as well as plow zone sites in general, are considered “not likely” to be 
eligible. The authors clearly state that their assessments are preliminary, however. 
They also note that the data presented should not be viewed as a substitute for site 
location identification surveys anywhere within the project area. Although “no 
specific fieldwork was carried out as part of this study” (Custer et al. 1984:1), some 
of the predictions made in the study were apparently field tested in 1984 and 1985. 
Results from this recent work were not included in the present review effort, but 
according to Custer (n.d.), “field tests of the predictions showed a 9 percent 
accuracy rate.” 


In general, the project fulfills its objective in that it succeeds in identifying 
zones that are likely to contain significant sites. The connecting links among 
regional prehistory/history, the existing data base, and the predictive models are 
difficult to follow, however, owing to the somewhat disorganized nature of the 
report. 


Montane Hunter-Gatherer Project Cultural Ecology and Economic Decision Making of 
Montane Hunter-Gatherers in Central Idaho. Steven Hackenberger. M.A. thesis, 
Department of Anthropology, Washington State University. 1984 


The Montane Hunter-Gatherer Project is a master’s thesis submitted to
Washington State University. It was developed with the objectives of (a) determining
how well proportional resource use by montane hunter-gatherers could be
predicted by comparing hypothetical decision-making strategies with observed
resource distributions, and (b) determining whether archaeological data could be
used to address the problem. The work is a by-product of a 1978 reconnaissance 
survey model-building project (Knudson et al. 1982) funded in part by the Forest 
Service and the Idaho State Historical Society. Information presented here is from 
Hackenberger (1984). 


The 1,216,800 ha project area is drained by the Middle Fork of the Salmon 
River and can be characterized as a forested montane environment with parklands 
and meadows. Environmental data (distribution of vegetation units, yields of
browse vegetation, and distribution of plant, fish, and ungulate resources in terms
of available calories for humans) were encoded for 520 grid units of 23.3 km² (9 mi²).
These data were used to develop general predictions for hunter-gatherer settle- 
ment location, proportional resource use, and winter population aggregation. 
LaPlace, Savage, and Wald decision criteria were used in computer simulations to 
model long-term choices of site location based on resource density and yields. 
Ethnographic data provided analogs for modeling economic decision making 
among historical and late prehistoric occupants of the region. 
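
The three decision criteria named above can be illustrated with a minimal sketch. The payoff table below is entirely hypothetical (the location names, the three "states," and the yield values are assumptions made for illustration, not data from Hackenberger's thesis), and the functions use the standard textbook definitions of the criteria: Laplace chooses the best average payoff, Wald chooses the best worst-case payoff, and Savage chooses the smallest maximum regret.

```python
# Hypothetical payoff table: each candidate location has a resource yield
# (arbitrary units) under three possible states (e.g., good, average, poor year).
# All names and numbers are illustrative assumptions.
payoffs = {
    "valley_floor": [100.0, 60.0, 0.0],
    "mid_slope": [65.0, 55.0, 35.0],
    "upland_meadow": [45.0, 45.0, 40.0],
}


def laplace(table):
    """Laplace criterion: treat all states as equally likely, maximize the mean."""
    return max(table, key=lambda loc: sum(table[loc]) / len(table[loc]))


def wald(table):
    """Wald (maximin) criterion: maximize the worst-case payoff."""
    return max(table, key=lambda loc: min(table[loc]))


def savage(table):
    """Savage (minimax regret) criterion: minimize the largest regret."""
    n_states = len(next(iter(table.values())))
    # Best achievable payoff in each state, across all locations.
    best = [max(row[j] for row in table.values()) for j in range(n_states)]

    def max_regret(loc):
        return max(best[j] - table[loc][j] for j in range(n_states))

    return min(table, key=max_regret)


if __name__ == "__main__":
    for name, rule in (("Laplace", laplace), ("Wald", wald), ("Savage", savage)):
        print(name, "->", rule(payoffs))
```

With this particular table the three criteria select three different locations, which shows why a simulation of long-term site choice can be sensitive to the decision rule assumed.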


These analyses indicated that models based on resource distributions or 
changes in distributions were more successful at predicting site location than 
models of various decision-making processes. Preliminary archaeological data were 
compared with predicted settlement locations and population sizes. Some of the 
predictions (e.g., locations of winter village sites) could be supported with available 
archaeological data, but in general, the researcher found that more survey would be 
required to provide data to test the models adequately.


This approach to predictive modeling, particularly the aspects that focus on
monitoring distributions of food resources, is promising because it offers the
potential for predicting and explaining the distribution of cultural resources. As the
model now stands, however, its application is limited to the time periods for which
ethnographic land-use data are available. Since spatial resolution is low and predictions
are difficult to quantify, the use of the approach is limited to the early planning
stages of cultural resource management. The models are testable, however, and
with refinement they could become more readily falsifiable. What is particularly
promising about the approach is that predictive modeling for purposes of cultural
resource management can be conducted in the context of problem-oriented investigations
that are likely to yield information important in prehistory and history.


Tar Sands Project The Tar Sands Project: Cultural Resource Inventory and Predictive
Modeling in Central and Southern Utah. Betsy L. Tipps. P-III Associates. 1984


The Tar Sands inventory modeling project was funded by the Bureau of Land
Management and carried out by individuals representing P-III Associates, an
archaeological consulting firm based in Salt Lake City. The project's objectives
included (a) implementation of a 5 percent inventory of each tract in the project
area, (b) development of a site locational model that would correlate environmental
characteristics with known site locations, (c) inventory of an additional 5 percent of
each project area tract and use of the resulting data to test and refine the model, (d)
development of projections of site density distributions and diversity of cultural
resources based on the results of the 10 percent combined inventory, and (e)
definition of the factors that determined cultural resource site selection and have 
explanatory value for predicting the location of sites. Information summarized here 
is from Tipps (1984). 


The 69,635 ha study area lies in the Canyon Lands section of the Colorado 
Plateau and exhibits typical Great Basin vegetation patterns: shadscale, sagebrush, 
and pinon-juniper zones. Two 5 percent simple random samples (with some 
modification) of 65 ha quadrats were drawn for survey purposes from each of four 
large tracts. Including “‘buffer zones,”’ some 7400 ha were surveyed and found to 
contain 155 sites (167 components) as well as a number of isolated finds. The sites 
represent occupations from the early Archaic to the historical periods. Prehistoric 
site density estimates with confidence intervals were made for each tract. Map- 
readable environmental variables were correlated with site locations in three of the 
four tracts using a discriminant analysis applied to data from one 5 percent sample. 
The results of the first analysis were tested and refined using the additional 5 
percent sample data and a set of siteless areas. A final discriminant analysis was 
based on the 10 percent sample. Using six environmental variables (relief, elevation,
distance to water, distance to nearest river, drainage, and quadrat vegetation
cover), the analysis correctly classified 71 percent of the quadrats into categories of
no sites, one site, and two or more sites; when these categories were combined, 93
percent of the quadrats with sites were classified correctly. Another predictive
model was generated using Landsat imagery data and cluster analysis to classify the
area and provide probability estimates of site occurrence. Its utility for management
purposes was found to be limited, however, because all the strata had similar
probabilities of site occurrence.
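
As a rough illustration of how a site density estimate with a confidence interval can be derived from a simple random sample of quadrats, the sketch below uses a normal approximation to the sampling distribution of the mean. The per-quadrat counts are invented for illustration, and the report's actual estimation procedure is not described in this summary, so the method shown is an assumption rather than a reconstruction of Tipps's analysis.

```python
import math
import statistics


def density_interval(counts, z=1.96):
    """Estimate mean sites per quadrat with an approximate 95 percent interval.

    Uses mean +/- z * standard error (normal approximation); the lower bound
    is clamped at zero because a negative density is not meaningful.
    """
    n = len(counts)
    mean = statistics.mean(counts)
    std_err = statistics.stdev(counts) / math.sqrt(n)
    low = max(0.0, mean - z * std_err)
    high = mean + z * std_err
    return mean, low, high


# Hypothetical site counts for twelve surveyed 65 ha quadrats in one tract.
sample_counts = [0, 0, 1, 3, 0, 2, 0, 1, 0, 5, 0, 1]

if __name__ == "__main__":
    mean, low, high = density_interval(sample_counts)
    print(f"sites per quadrat: {mean:.2f} (approx. 95% CI {low:.2f} to {high:.2f})")
```

Site counts per quadrat are typically skewed, with many empty quadrats and a few dense ones, so for small samples an interval of this kind should be read as a rough guide only.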

This project achieved most of its goals, especially those related to the sample
surveys and to finding correlations between environmental variables and site
locations. In fact, this study represents one of the more sophisticated and better
presented versions of the now-familiar correlative approach to predictive modeling
(e.g., Larralde and Chandler 1981; Kemrer 1982; Kvamme 1983; Bradley et al. 1984).
The discrimination of three classes of grid units (those with no sites, those with
one site, and those with more than one site) may be an improvement over
approaches that only distinguish between site-present and site-absent quadrats. It
is also noteworthy that during the course of fieldwork an effort was made in some
areas to determine whether there were buried cultural materials. Existing road cuts
and cutbanks were examined, and a few buried sites were recorded. This practice
seems advisable in areas noted for their long histories of high rates of erosion (e.g.,
the Southwest and the Great Basin).
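
The gap between a three-class accuracy figure and a combined site-present figure, such as the 71 and 93 percent values reported for the discriminant analysis, can be seen in a small sketch. The confusion matrix below is hypothetical (the counts are assumptions chosen only to show the mechanics and do not come from the Tar Sands report): confusing a one-site quadrat with a multi-site quadrat counts as an error in the three-class score but not once the two site-bearing classes are combined.

```python
# Rows are actual classes, columns are predicted classes, in the order:
# no sites, one site, two or more sites. Counts are illustrative assumptions.
confusion = [
    [40, 6, 4],   # actual: no sites
    [3, 12, 5],   # actual: one site
    [1, 6, 13],   # actual: two or more sites
]


def overall_accuracy(matrix):
    """Fraction of all quadrats assigned to exactly the right class."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def site_present_accuracy(matrix):
    """Fraction of site-bearing quadrats predicted to contain any site."""
    site_rows = matrix[1:]                       # actual one site, two or more
    total = sum(sum(row) for row in site_rows)
    correct = sum(sum(row[1:]) for row in site_rows)
    return correct / total


if __name__ == "__main__":
    print(f"three-class accuracy: {overall_accuracy(confusion):.0%}")
    print(f"site-bearing quadrats correct when combined: "
          f"{site_present_accuracy(confusion):.0%}")
```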


One shortcoming of the discriminant and Landsat models was that the White 
Canyon tract was excluded from the analysis. This exclusion is unfortunate because 
even though this tract represents only 6.1 percent of the project area, it has an 
average density of 2.86 sites per quadrat. The discriminant model and the Landsat 
models are subject to other criticisms frequently made of projects using a correlative 
approach (see Berry 1984), including criticisms of arbitrary distinctions between 
sites and isolated finds. 


The project was much less successful in achieving the goals of defining and 
explaining factors that determine site location. For example, the following partial 
explanation was offered for the success of the discriminant function in distinguish- 
ing quadrats with only one site: 





the single sites in these quadrats generally represent small, limited activity sites that
occur in a localized anomalous portion of the quadrat. The quadrats in which these
isolated sites are found may represent areas where more specialized or limited types of
activities were occurring such as hunting or plant gathering or material procurement.
For such sites variables such as distance to water, percent of quadrat cover, etc., may not
be key factors in site location at all. We note, as do previous researchers, that site type is a
critical factor in understanding the site selection process for prehistoric peoples [Tipps
1984:158].





These statements recognize the problem that lumping site types obscures impor- 
tant differences, but they do not explain why distance to water and vegetation type 
should be less useful for predicting the locations of hunting or vegetal procurement 
sites than for basecamps or other multiple activity sites. These explanations 
assume, as most correlation-based explanations do, that groups who used these 
anomalous quadrats (35.6 percent of all those with sites) for thousands of years all 
did so in essentially the same manner, in spite of significant changes in human 
population densities and technological developments, not to mention climatic
changes that surely affected the distribution of food resources. Even if such
redundancy of land use could be demonstrated, it too would require considerable
explanation.


Central Oregon Project Locating Significant Archaeological Sites by Landform
Analysis in Central Oregon. Leslie E. Wildesen. Draft report submitted to the
Bureau of Land Management, Oregon State Office and Prineville District
Office. 1984


The central Oregon predictive modeling project was funded by the Bureau of
Land Management and conducted by personnel representing Wildesen Associates,
a Portland-based archaeological consulting firm. The project's objectives were to
identify lands likely to contain significant prehistoric sites requiring “affirmative
management action” and to identify lands not likely to retain an important
archaeological record. The purpose of identifying these land categories was to focus
efforts on sites that are subject to the requirements of the National Historic
Preservation Act. Information summarized here was taken from a draft document
by Wildesen (1984), which was circulated widely for review purposes.


The overall project encompasses an area comprising almost 1 million ha, of
which 427,787 ha are managed by the BLM. Characteristic vegetation communities 
include sagebrush and grasslands as well as juniper and ponderosa forests. Within 
the larger area, 364 prehistoric sites were documented in existing site files, and 244 of 
these are on BLM land. All sites in the project area and all lands managed by the 
BLM were used to develop the model. The sites were judged to represent the full 
“functional and descriptive” range of site types known from the Desert West. 


The concept of site significance was an important element of this study.
Wildesen followed a previously established working definition for the concept of 
“important information,” which was defined as 


substantive new information on northern Great Basin settlement or subsistence patterns,
chronology, toolkits or technology, art, or intercultural relations (including travel
or trade) [BLM 1982, cited in Wildesen 1984:2].


Wildesen (1984:3) goes on to note that, by implication, significant sites 


will show evidence of more than one kind of use, or more than one use event;

will contain diagnostic tool types, comparable with existing typologies;

will exhibit physical integrity over more than 90 percent of the surface area;

may contain internally stratified sediments or cultural layers;

may contain artifacts or manufacturing debris, faunal remains, or constructed features
(cairns, pits, painted or pecked rock art panels, or walls); or

may be related to similar or different sites within a specific geographic area (i.e.,
comprise part of a National Register District).


Ethnographic data were employed to identify the kinds of landforms used by
Native Americans for various activities. Five landforms were identified as having
been used ethnographically and as having the potential for containing sediments
with “high physical integrity.” These landforms are noted as having already
yielded “archaeological sites with substantial scholarly values.” These five landforms,
along with two other landforms “known to contain archaeological remains of
significant interest” (Wildesen 1984:4), were classified as high probability landforms.
These seven landforms were calculated to represent only 7.3 percent of the project
area. The model is presented in the form of text, tables, graphs, computer printouts,
and maps that illustrate the locations of high probability landforms.


Explanations for and potential applications of the modeling approach were as
follows: 


By focusing the analysis on where natural processes are not likely to have preserved
intact archaeological evidence, as much as 93 percent of the study area can be removed
from the potential data base. This does not mean that some evidence of prehistoric use
may not be present on those acres, or that those acres were not used at some time in the
past. It does mean that evidence of use is likely to be disturbed, inconclusive, or missing
entirely from the record. Under such circumstances, it is very unlikely that the
archaeological values of any sites located on these acres will warrant substantial archaeological
resource management activity, or will require significant effort to resolve conflicts with
other resource management activities [Wildesen 1984:56].


This project succeeded in identifying lands likely to contain significant prehistoric
sites, but the methods used to accomplish these goals are problematic. First,
there seems to have been no systematic attempt to evaluate the quality of data in
the site files. If the Oregon site files are similar to those in other parts of the United
States, one might suspect that they need to be “cleaned” before being used to
construct models. Second, the model relies heavily on ethnographic analogy. Use of
ethnographic information to define the areas habitually exploited by human groups
for thousands of years and during different climatic regimes seems to be of limited
value.


Of greater concern is the approach to defining “significant” sites. Categorical 
criteria are established for defining significance, and they virtually exclude small, 
disturbed, and plow zone sites, which have long been argued to be potentially 
significant (Talmage et al. 1977). Furthermore, the criteria do not acknowledge the
potential for some site types (e.g., task-specific sites or small residential sites that
might have been disturbed by natural processes) to contribute important information
regarding significant research topics (e.g., land use systems for mid-Holocene
hunter-gatherers). Removal of a 396,559 ha area encompassing an undetermined
number of cultural resources from the potential data base may be premature,
especially if this is done on the basis of existing, but unevaluated, survey data. The
significance criteria outlined in Wildesen (1984) imply that significance is related
directly to the degree to which an archaeological site can be considered to encapsulate
an undistorted view of the past. Binford has responded to those who share this
expectation by noting that “seeking a reconstructed Pompeii is an unrealistic and
unprofitable goal in the light of knowledge we have and the data available to us in
[the archaeological] record” (Binford 1981:206). It may be possible to construct a
predictive model that can be used to “write off” areas because they contain only
insignificant sites, but in the draft document summarized here, Wildesen (1984)
does not present a convincing argument that the data base in question is adequate
for this purpose.


CONCLUDING COMMENTS 


This survey of predictive locational models is intended to present information
on a range of approaches to predictive models in different areas and for different
kinds of cultural resources. This appendix differs from the other sections of this
volume in that it is a sample inventory of what has been and is being done in
predictive modeling; it is not an evaluation of how predictive modeling is expected
to be done or how it should be done in the future. The concluding paragraphs in the
synopses of the projects are narrative assessments of how well the projects achieved
stated objectives and, as such, are more judgmental than descriptive.

The goals of this survey were (a) to summarize projects representative of the
known range of variation in approaches, geographic settings, and types of resources
being modeled; (b) to provide a descriptive summary and assessment of the
individual models; (c) to present data that facilitate comparisons among the different
models; and (d) to provide enough information to permit the reader to make an
independent assessment of the predictive locational modeling approaches
reviewed. Although this survey was not designed to be a synthetic statement
concerning predictive modeling, nor a critical review of individual projects, it does
seem appropriate to end with a few comments of a more synthetic nature. Those
offered here are based mainly on the detailed examination of these 22 project reports
and on a perusal of many others.


The following discussion is intended to address two general questions. Do
existing models contribute substantially to the management of potentially significant,
nonrenewable cultural resources? And do they contribute information important
to our understanding of history or prehistory? It is clear that some of the
models contribute information important to history or prehistory. Those
with the potential for explaining aspects of human behavior are likely to be of
special interest to archaeologists. Other predictive models provide probability
estimates for encountering a particular kind of site at a specific place on the
landscape, and that information is of special interest to land managers charged with
protecting significant sites. None of the models assessed here have both explained
significant aspects of human behavior and predicted the probability of finding
evidence of specific behavioral patterns at specific places on the landscape, however.

Given the widespread perception that cultural resource management and
research goals are separate and not especially compatible, this lack of models that
meet both goals may not be surprising. It is not inevitable, however, because
predictive modeling has the potential to contribute information important to both
managing and understanding cultural resources. Granted that predictive modeling
has not been perfected, what has it contributed during the past decade?

In the first place, more sites are being discovered and documented in a wider
range of environmental settings than would have been the case 10 years ago. This is
partially because sample surveys that provide the data base for predicting the total
number of sites are often designed specifically to detect the range of site types in
different settings. At the same time, there is an increased awareness that a high
proportion of the extant archaeological materials is likely to be found in a small
proportion of the landscape. Conversely, there is recognition of the potential that
important cultural resources will be discovered within those portions of the landscape
with lower site densities. Furthermore, it is becoming clear that there are few,
if any, areas without any evidence of utilization by human groups. These contributions
mean that cultural resource specialists, whether managers or archaeologists,
are in a position to better understand the nature of cultural resources in a given area
and the distribution of different kinds of archaeological materials on the landscape.

Development and use of predictive models also has focused attention on the
interrelationships between environmental factors and site locations. The search for
significant spatial correlations has identified many key environmental variables
useful in predicting site locations. By knowing which environmental settings are
likely to have certain kinds of sites, managers can determine how those areas can be
managed with minimal effect on cultural resources. The correlations also provide
data bases useful in assessing site function and testing models about land-use
systems. Inclusion of information about past environmental settings is likely to be
particularly useful in understanding how and why prehistoric groups used the
landscape in a particular fashion.

Another contribution of predictive modeling has been the compilation of
quantitative, as opposed to qualitative, data bases. With information on the
estimated density and distribution of cultural resources, land managers can develop
more effective plans for the long-term conservation of significant cultural resources.
Given reliable survey methods and quantitative results, inter- and intraregional
comparisons of site distributions can be made, along with comparisons of densities
or other measures of the intensity of use. In turn, the data from these comparisons
are useful in testing models about many aspects of past human behavior.

The predictive modeling approach has also resulted in a number of trends that
may not contribute substantially to the acquisition of important information about
history or prehistory. Some of the trends may actually hamper the well-informed
management of nonrenewable cultural resources. Of potential concern are the
models that provide probability estimates for encountering a generic site, one that
could be of any type or age, at a particular point on the landscape. The generic site
approach can imply that all sites are of equal importance, when clearly they are not.
Land managers must protect only the significant ones. This suggests the need to
become more discriminating about what is being predicted.


Although predictive modeling has focused attention on the interrelationship
between environmental factors and site locations, there is considerable variation
among environmental variables that ostensibly predict site locations. Among the
more common predictors are specific values for vegetation type, proximity to water,
landform, solar exposure, soil type, slope, and elevation. Site locations and behavioral
patterns that led to the deposition of materials probably correlate spatially
with many other key environmental factors. Regrettably, the reader is often left
with no information as to the significance and explanatory value of these correlations.
The importance of correlations is manifested in their ability to predict site
locations, especially those judged to be significant in terms of National Register
criteria. In turn, site significance is determined by the resource's potential to
contribute important information. That determination often requires understanding
of why environmental variables correlate highly with site locations and/or with
the kinds of human behavior that account for the site locations.


Identifying key environmental variables without explaining how and why they
correlate with site location is tantamount to making predictions in a cultural and
behavioral void. A review of the project summaries presented here illustrates a
tendency to predict where sites should be found without adequately addressing the
question of how humans used the environment. There is little discussion about
relationships between the nature and distribution of basic food and nonfood
resources on the one hand and complex human land-use systems on the other. A
detailed study of some predictive models might convince the reader that the
primary goal is to predict the distribution and density of prehistoric things on the
landscape. Such predictions may be useful, but usually only in conjunction with
other data that allow greater discrimination among the things predicted.


The tendency in many predictive models to avoid explanation and to make
predictions in a cultural and behavioral void probably is related to a trend toward
development and utilization of new technologies. Computers are the focal point of
the new technologies because many of the modeling approaches depend on complex
statistics and massive data files. GIS and Landsat are examples of new technologies
that facilitate reliable point predictions. There is a danger, however, that these
technologies could become the end product, rather than serving as a source of
information useful in managing and understanding significant cultural resources.
Given an emphasis on new technologies and the finite amount of time and money
allocated to cultural resource management projects, there seems to be little time to
study why the archaeological record appears as it does. Natural and cultural
transformation processes are seldom discussed, and examination of human land-use
systems is the exception rather than the rule in the predictive models reviewed
here. It should be recognized, however, that the use of GIS, Landsat, and multivariate
statistics is relatively new in predictive locational modeling. As with many new
technologies, they can be expected to be used as a means to more informative ends
as the science of predictive modeling matures.

In some cases there is an overreliance on current vegetation and young
landforms to predict the occurrence of sites. An example would be the presence of
sand dunes formed 4000 years ago as predictors of locations occupied by people 5000
years ago. Although these two events could be related, the underlying mechanisms
are seldom discussed. Equally bewildering would be the significance of high positive
correlations between the location of a 4000-year-old pinon-juniper forest and that of
a hunting site occupied 6000 years ago, when the area may have been dominated by
greasewood and sagebrush. It would seem more appropriate to identify environmental
variables that are useful in predicting site locations and explaining the

Sand dunes, forests, and other aspects of the environment often act to bury or
obscure cultural materials. Although this statement is an axiom to cultural resource
specialists, most predictive models are not concerned with the discovery of buried
or otherwise obscured sites. Discussions about depositional processes and the ages
of landforms are seldom included in predictive models. In general, there is a paucity
of discussions about the visibility of cultural materials on the surface, and discussions
of survey methods rarely include a section on techniques used to find buried
sites. Only a few of the models reviewed here address the relationship between the
theoretically expected range of site types and the range of site types recorded in the
region or in specific survey areas. Fluvial and aeolian processes clearly act to bury
older sites in many areas, and forest litter obscures hundreds of sites in other areas.
If predictive modeling is designed to provide useful information on the distribution
and density of all site types, the models should incorporate information on depositional
and erosional processes and their effect on the archaeological record.


Another factor that limits the potential contributions of predictive modeling is
an overreliance on the ethnographic record in predicting prehistoric site distributions.
Investigators often assume that the settlement and subsistence patterns
documented in the ethnographic record are manifested throughout the archaeological
record. In other words, the investigators assume that by knowing something
about settlement and subsistence patterns during the “ethnographic present” they
also know where people camped and what they ate during the previous millennia.
Detailed discussions of the time depth for the ethnographic pattern are uncommon.
There are equally few in-depth discussions about the kinds of land-use systems that
may have operated before human populations reached historical levels, or before
they were decimated by European diseases, or before the density and distribution of
large land mammals were reduced by environmental factors and/or human agents.
Seen from this perspective, the ethnographic record may not provide information
useful in predicting the locations of sites representative of land-use systems with
very different settlement and subsistence patterns. In fact, overreliance on the
ethnographic record is likely to inhibit detection of the range of site types present in
the archaeological record.

Finally, there may be a growing tendency to “write off” large tracts of land by
not recommending an inventory-level survey. Although the sample of models
summarized in this appendix is not statistically representative of the universe of
predictive locational models, it is informative to note that about 23 percent of them
include statements that either open the door to “writing off” large tracts of land or
actually recommend it. None of the reports written prior to 1980 make such
recommendations, but at least one report written that year makes that implication.
Two such recommendations were made in 1983, and two others in 1984. Whether or
not there is hard evidence for a growing tendency toward such recommendations is
debatable, and in any case there may be justifications for some of those recommendations.


The decision to not recommend an inventory survey is usually made on the
basis of sample survey data and/or information drawn from a review of available
site-file data. Areas are usually written off because no sites are expected to occur
there or because those that do occur there are not expected to be significant. The
main problem with this procedure is that the reliability of the data base used for
making the recommendation is usually questionable. The reliability of the data base
depends upon the soundness of survey methods and/or upon the approach used to
determine site significance. A second problem is that recommendations to write off
an area without conducting an inventory survey tend to be based on the distribution
of sites of known types, sites that were discovered using methods designed to
find the best-known kinds of sites. This approach does not encourage the discovery
of unknown but theoretically expected site types; rather, it focuses on refining
established models. Generally, this encourages additional discoveries of sites of the
best represented kinds at the expense of older sites and site types that are not
readily visible on the surface. Exempting large areas of the landscape from inventory
survey without assessing the reliability of the data base has the potential of
ensuring that the range of site types remains undocumented.


The use of data generated by predictive locational models to legitimize
no-survey recommendations is of particular concern because of the nature of
cultural resources. Cultural resources are potentially important to many people for
many different reasons, and they are nonrenewable. Once nonrenewable cultural
resources are written off, they are likely to be excluded from further study,
regardless of the validity of the rationale for this recommendation. In fact, the
legality and ethics of writing off resources, especially on the basis of dubious data,
are now being questioned. This is evidenced by lawsuits being brought against agencies
that have cleared areas containing archaeological materials and by the increasing
national dialog among archaeologists about this subject (Darse and Keyser 1985;
Tainter 1984).

Overall, there is considerable variability in approaches among predictive models,
both among those conducted in the systemic context and among those carried
out in the analytic context. All of the models reviewed here were developed to
provide information useful in the management of significant, nonrenewable cultural
resources and/or information important to our understanding of history or prehistory.
Although there are examples of models that provided information of special
use to land managers and of models useful in explaining aspects of human behavior,
none of the models assessed here were successful in providing both kinds of
information. Even so, it seems clear that predictive modeling, as used in cultural
resource management, has the potential to provide both kinds of information. Given
the relative recency of predictive locational modeling as a scientific approach in
cultural resource management, both uses and abuses of it should be expected (Ambler
1984). Likewise, it should be anticipated that the potential to contribute a wide
range of useful information will be realized as the science of predictive modeling
matures. This volume was designed to provide the reader with information about
how that potential might be realized.